Posts Tagged ‘data’

Sysomos Partners With Tickr To Bring Business Intelligence To A Whole New Level

Today we’re both proud and excited to announce our new partnership with Tickr, a pioneer in the real-time visualization of enterprise-level data.

As of today, Sysomos will be known as a ‘Preferred Social Intelligence Partner’ in the already expansive Tickr ecosystem.

But what is Tickr and why would this be useful to you?

Before Tickr, marketers, analysts, managers and others had to log into dozens of websites to check their web analytics, track media, see sales information and, of course, view what was going on with their social media. Tickr simplifies all of that by gathering data from thousands of different sources (including Oracle, SAP and Salesforce, not to mention social networks) and then presenting the information in a single custom dashboard or “Command Center” that offers a complete picture of brand and business performance in real time.

“We want to partner with a leader in the social data provider industry, and Sysomos is that leader,” says Tyler Peppel, CEO and Founder of Tickr. “Sysomos enables brands to build towards the next generation of social business intelligence, and we recognize that by working together, we can visualize the enterprise like no one else can.”

As a ‘Preferred Social Intelligence Partner,’ Sysomos will deliver not only the social data that populates the stunning Tickr user interface, but also the insights that power better decision-making at the enterprise level. To put it plainly, the real-time visualization and monitoring of social activity alongside other key performance metrics will create a comprehensive intelligence platform that we feel is unmatched in the industry.

Our CEO, Jim Delaney, says, “As a powerful bridge between social data and enterprise metrics, Tickr is an ideal partner for us. Tickr is an extension of Sysomos’ data capabilities, with a customizable user interface that allows us to offer customers rich information in a compelling visual format that businesses and marketers can quickly digest.”

We’re incredibly enthusiastic about this new partnership and think that you should be as well.

If you’d like to learn more about how Sysomos and Tickr can work together to help you improve your analysis, decision making and, ultimately, your business, please feel free to contact us and we’ll be happy to get the conversation started.

For more information about this new partnership, see the official press release here.

How To Present Data So That It Sticks

It was four years ago at SXSW that I first met Eric Swayne. At the time, Eric was a Sysomos client through the agency he was working for and wanted to meet someone from our team. I gladly agreed, and we sat down over lunch and had a very interesting conversation.

I quickly realized that Eric was a pretty smart cookie. Ever since then we’ve kept in touch via emails and Twitter and we always meet up every year in Austin when we return for SXSW. Also, if I’m ever at an event where Eric is speaking, I make sure that I get to his session so I can hear what he has to say.

This year’s SXSW Interactive was no different. I casually bumped into Eric at a BBQ event and he told me that he would be giving a talk two days later. While it wasn’t on my original schedule (simply because I hadn’t seen that he’d be speaking), I quickly changed my plans for Monday afternoon to make sure I was in attendance.

What I saw was a fantastic presentation entitled Science to Storyteller: Enter the Data Narrator. In this presentation Eric talked about why the way you present data really matters and how to do it better. I thought this would be quite an interesting topic, especially to people who read this blog, so I took the tweets that came out during his presentation and turned them into a Storify story.

Below you’ll find the Storify of Eric’s presentation with highlights that include great tips on presenting data, such as:

  • Seeing something in data and getting others to understand are two completely different things
  • What a data insight is
  • How to find an insight and then make it stick with the people you’re presenting it to
  • Show just the right amount of data
  • And, how to deal with insights and scheduled or automated reports

Here’s what people picked out and tweeted during the presentation:

If you’d like to see the actual presentation, head over to Eric Swayne’s website and see the Prezi presentation in full.

Do you have any tips on presenting data? Leave them in the comments for everyone to see.


200 Billion And Counting

Every day the world of social media grows exponentially. New people get online. More people discover a social network that they love. People upload more and more media. And of course, the flow of content, whether it’s a tweet, an article, a blog post, a status update or a video, never stops.

You may remember that a mere 8 months ago our social media monitoring and analytics software, powered by the Sysomos engine, indexed its 100 billionth piece of content. Well, if only to highlight to the world the quick and massive growth of social media use, on Tuesday we indexed our 200 billionth piece of content. That’s 200,000,000,000 written out in numbers.

This means that our customers now have access to over 200 billion social media conversations that they can analyze in mere seconds.

To demonstrate how quickly the rate of social media content grows I decided to conduct a little experiment. I took a bunch of common words (it, its, and, the, what, why, I, a, to, too, or, if, you, your) and looked them up in our MAP software to see how many times they appeared yesterday (May 29, 2013), a year ago (May 30, 2012) and the date we hit 100 billion (September 19, 2012).

The results I found were actually quite interesting and help to demonstrate my point quite nicely.

One year ago, I found 125 million conversations across blogs, online news, forums and Twitter containing my list of common words. By the time September 19th rolled around, those same words generated 127 million results. That’s an increase of 2 million posts per day in almost 4 months. Then 8 months later, yesterday, those same words appeared in an astounding 139 million posts. That’s a jump of 12 million pieces of content per day.
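To put those jumps in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the three daily counts quoted above; the `growth` helper is just an illustrative name, not something from our MAP software.

```python
# Daily result counts (in millions of posts) as quoted in the text above.
daily_posts_millions = {
    "2012-05-30": 125,
    "2012-09-19": 127,
    "2013-05-29": 139,
}


def growth(start, end):
    """Return (absolute change, percent change) going from start to end."""
    change = end - start
    return change, 100.0 * change / start


first_change, first_pct = growth(
    daily_posts_millions["2012-05-30"], daily_posts_millions["2012-09-19"]
)
second_change, second_pct = growth(
    daily_posts_millions["2012-09-19"], daily_posts_millions["2013-05-29"]
)

print(f"May 2012 to Sep 2012: +{first_change}M posts/day ({first_pct:.1f}%)")
print(f"Sep 2012 to May 2013: +{second_change}M posts/day ({second_pct:.1f}%)")
```

In percentage terms, the second jump is several times larger than the first, even though the second period is only about twice as long.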

[Images: results for May 30, 2012, September 19, 2012 and May 29, 2013]

Granted, my list of common words is far from covering the full gamut of what’s out there in social media and the use of these specific words could vary from day to day. However, for illustration purposes, it works well.

As time goes on, more social networks and channels will appear and more people will realize the magic of social media and being able to connect with people around the world. And as that happens, we’re going to keep on capturing and indexing all those conversations to give our customers the largest and most complete sets of social media data.

Using Data To Delight Your Community

Every company or brand out there has its fans and naysayers. It’s just a part of business. One great example where we’ve seen both types prominently over the past few years is the company formerly known as RIM (now known as just BlackBerry).

The company was at one point the leader in smartphone technology. In fact, they were probably the first real smartphone makers in the market. But then other companies like Apple and Google entered the market and some people felt that BlackBerry had been left behind. Fast forward a few years and BlackBerry has made a stunning reemergence in the field with their fully redesigned operating system known as BlackBerry 10.

When BlackBerry announced that it was completely revamping itself from the ground up, it was again met with its fair share of vocal fans and naysayers. For example, take a look at this tweet below that highlights one of the naysayers being countered by one of the very vocal fans:

One thing is for sure; whether it was from a naysayer or a fan, there was a lot of talk leading up to the launch of BB10. Check out this popularity chart below for mentions of BB10 over the past 6 months leading up to the launch. There were over 19 million tweets during this period.

Now, this is where the story gets very cool:

TELUS, a large telecom in Canada, was just as excited about launching the BB10 line on their network as some of the super fans out there were. They also knew how excited a lot of their customers were for the new BlackBerry devices. That’s why they decided to reward a lucky customer who was the most eager (and persistent) for the big release.

Using our MAP platform, TELUS was able to analyze millions of conversations about BlackBerry and BB10 from across Canada to find the people who were talking the most positively about their excitement for the new smartphone. By cross-referencing the top BB10 anticipators with their client records, they were able to grant one lucky customer’s wish of being one of the first people in the world to own a BlackBerry Z10 device.
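For readers who like to see the idea in code, the cross-referencing step can be sketched roughly as follows. This is not how TELUS or MAP actually did it; the handles, sentiment labels, customer IDs and the `top_positive_customers` function are all made up for illustration.

```python
from collections import Counter

# Made-up sample data: (author handle, sentiment) pairs for mentions of a topic.
mentions = [
    ("@fan_a", "positive"),
    ("@fan_b", "positive"),
    ("@fan_a", "positive"),
    ("@critic_c", "negative"),
    ("@fan_a", "positive"),
    ("@fan_b", "positive"),
    ("@fan_d", "positive"),
]

# Made-up customer records keyed by social handle.
customers = {"@fan_b": "customer-1002", "@fan_d": "customer-1004"}


def top_positive_customers(mentions, customers, limit=3):
    """Rank authors by positive mentions, keeping only known customers."""
    positive_counts = Counter(
        author for author, sentiment in mentions if sentiment == "positive"
    )
    ranked = [
        (author, count, customers[author])
        for author, count in positive_counts.most_common()
        if author in customers
    ]
    return ranked[:limit]


result = top_positive_customers(mentions, customers)
print(result)
```

In this toy data the most enthusiastic author, `@fan_a`, is not in the customer records, so the first match is `@fan_b`. That is the point of cross-referencing: the loudest fan overall is not necessarily one of your customers.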

This is a great example of how companies can use big data (both from social media and from their own databases) to show their customers and fan base that they’re listening and that they care what they think.

Dan Fricker, TELUS’s Social Media Community Manager, had this to say:

“Social media’s one of our many ways of connecting and actually having conversations with customers. What Sysomos offers is an incredible way to listen to those conversations, from different people all over the country. Beyond engaging in real-time interactions, we can also go back and see what people have been anticipating most about the launch of BlackBerry 10, for example, or who’s been talking about this new device the longest. That’s the case with @Im_Sure_ who’s been tweeting with @TELUS about the BB10 for weeks. With tools like Sysomos and the power of social we can engage in customer conversations like Matt’s, arguably the BB10’s #BestFan. Given Matt’s such a big BlackBerry fan, we surprised him with his very own Z10 today.”

Data, Data and More Data

This blog post is the first in the “Engineering” series by Sysomos’ co-founder and CTO Nilesh Bansal. As part of this series, Nilesh will share experiences in engineering Sysomos’ social media platform.

One question that is frequently asked is: What’s the biggest challenge I face? The simple answer is: data.

As I write this sentence, in less than a minute, our crawlers have collected tens of thousands of new conversations happening online. Within the same minute, each of these conversations was discovered, retrieved, cleaned, analyzed, and stored on our servers. Now, that is a lot of data.

We store billions of documents on our servers. Every hour, millions of them are read and analyzed by users. In the last few years, our team has experimented with a variety of options to get a better understanding of the black art of data management. I’ll share some of them in this post.

There are two main components of the storage layer: hardware and software. As well, there’s a third option: outsourcing by using Amazon’s cloud infrastructure, S3 and EC2. Cloud storage is a convenient option, and, if used properly, even economical. But convenience comes at the price of flexibility.

While Amazon has steadily added more customization options and features, there still isn’t enough flexibility to meet our needs. The lack of flexibility also limits our options to innovate, such as our plans to start using solid-state drives. As a result, we have stayed with a conventional on-premises solution.

Storage Hardware

Disks and storage bays are the most expensive part of a purchase order. They are also the slowest and the least reliable. This means they have to be selected carefully.

There are three main architecture options. First, network mounts and NAS obviously will not work given the low latency requirements. Second, fiber-based SAN offers flexibility in adding new disk arrays or moving them across hosts, but is significantly more expensive. If planned properly, this flexibility is not really needed.

The last option, which I prefer among the three, is internal and direct attached storage. If I had to select one configuration option, I would go with a 2U server with 12 bays containing 1TB SATA disks, 60-80GB RAM, and 16 processing cores. This provides a good balance of computing power and storage space. Adding more disks is easy by adding an external disk array connected via SAS cable.

Reliability

Disks fail, and when they fail all the data on them is lost. RAID stores redundant information (mirrored copies or parity) across different disks to ensure reliability. RAID 5 is the most commonly used option. However, because disk sizes have increased exponentially while bit error rates have stayed at the same level, there is a non-zero chance of data loss in RAID 5. RAID 6 adds an extra disk’s worth of parity to RAID 5 to provide higher reliability. Write speeds in both RAID 5 and 6 are slow, which makes them a poor fit for what we do.
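The RAID 5 concern can be made concrete with a rough estimate. The sketch below assumes the usual failure scenario: one disk has died and every bit on the surviving disks must be read to rebuild it. The error rate of one unrecoverable read error per 10^14 bits is an assumed, typical spec-sheet figure for SATA disks, not a number measured on our hardware, and the model treats bit errors as independent.

```python
import math


def rebuild_read_error_probability(surviving_disks, disk_bytes, bit_error_rate):
    """Probability of at least one unrecoverable read error while reading
    every bit on the surviving disks during a rebuild.

    Assumes independent bit errors at a constant rate (a simplification).
    """
    bits_to_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed in a numerically stable way for tiny p, huge n.
    return -math.expm1(bits_to_read * math.log1p(-bit_error_rate))


# Assumed numbers: a 12 x 1 TB array with one failed disk (11 survivors),
# and one unrecoverable error per 1e14 bits read.
probability = rebuild_read_error_probability(11, 10**12, 1e-14)
print(f"Chance of hitting a read error during rebuild: {probability:.0%}")
```

Under these assumptions the estimate comes out a little under 60%, which is why larger disks make single-parity RAID increasingly risky.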

We use RAID 1+0 where all data is mirrored on two different disks. Since all data is stored twice on two separate disks, it means twice the cost but it also provides the best reliability and high performance.
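The cost trade-off between the RAID levels is easy to quantify. Here is a small sketch using the standard usable-capacity rules for each level, applied to the 12-bay, 1TB-per-disk configuration mentioned earlier; the function name is only illustrative.

```python
def usable_capacity_tb(level, disks, disk_tb):
    """Usable capacity in TB for a single array at the given RAID level."""
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_tb  # one disk's worth of parity
    if level == "raid6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_tb  # two disks' worth of parity
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 1+0 needs an even number of disks, 4 or more")
        return (disks // 2) * disk_tb  # every disk has a mirror
    raise ValueError(f"unknown RAID level: {level}")


for level in ("raid5", "raid6", "raid10"):
    print(level, usable_capacity_tb(level, disks=12, disk_tb=1), "TB usable")
```

With 12 disks, RAID 1+0 gives up roughly half the raw space, compared with one or two disks for RAID 5 and 6. That is the "twice the cost" in practice.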

Storage Software

As our crawlers continue to add more data every minute, and our users analyze thousands of documents every second, data storage is an important consideration. While we use a combination of different solutions, from flat files and custom data structures to inverted indexes, key-value pairs and relational databases, the bulk of our data is handled by MySQL.

For the most part, MySQL is used as a simple key-value store. Since a single instance of MySQL can’t hold all the billions of documents within Sysomos, we partition the data logically across several big, fat servers. Each server is maintained independently (as NDB cluster does not really scale), using primarily the InnoDB storage engine. Inside each instance, we further partition the data logically into hundreds of different tables. This partitioning lets us add new data without hitting the wall.
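To illustrate what two-level logical partitioning can look like, here is a minimal sketch. The partition key (publication date), the server names and the table naming scheme are all assumptions made for the example; the post does not describe the actual scheme.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical layout: each server owns documents from its start date
# up to (but not including) the next server's start date.
SERVER_RANGES = [
    (date(2011, 1, 1), "db-01"),
    (date(2012, 1, 1), "db-02"),
    (date(2013, 1, 1), "db-03"),
]
_STARTS = [start for start, _ in SERVER_RANGES]


def locate(doc_date):
    """Return (server, table) for a document published on doc_date."""
    index = bisect_right(_STARTS, doc_date) - 1
    if index < 0:
        raise ValueError(f"no server configured for {doc_date}")
    server = SERVER_RANGES[index][1]
    # Second level: one table per month inside the chosen server.
    table = f"docs_{doc_date:%Y_%m}"
    return server, table


print(locate(date(2012, 9, 19)))
print(locate(date(2013, 5, 29)))
```

With a layout like this, growing the system means appending a new entry to the range list for a new server, and new months simply become new tables. Existing data never has to move.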

While MySQL is good enough for basic SELECT and INSERT operations, this is all it can do. Even thinking of a JOIN or any complicated operation can make the server crash. But as long as it is used as a key-value store, MySQL can handle a lot of data and provide for all replication and backup needs.
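The key-value access pattern boils down to a primary-key INSERT and a primary-key SELECT, and nothing else. The sketch below shows that pattern using Python’s built-in `sqlite3` module purely as a stand-in so the example runs anywhere; the `KeyValueTable` class and the column names are hypothetical, and the upsert syntax differs in MySQL (for example `REPLACE INTO`).

```python
import sqlite3


class KeyValueTable:
    """A relational table used strictly as a key-value store."""

    def __init__(self, connection, table="documents"):
        self.connection = connection
        self.table = table  # trusted internal name, not user input
        connection.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(doc_key TEXT PRIMARY KEY, body BLOB NOT NULL)"
        )

    def put(self, key, body):
        # SQLite upsert; in MySQL this would be REPLACE INTO or
        # INSERT ... ON DUPLICATE KEY UPDATE.
        self.connection.execute(
            f"INSERT OR REPLACE INTO {self.table} (doc_key, body) VALUES (?, ?)",
            (key, body),
        )
        self.connection.commit()

    def get(self, key):
        row = self.connection.execute(
            f"SELECT body FROM {self.table} WHERE doc_key = ?", (key,)
        ).fetchone()
        return None if row is None else row[0]


store = KeyValueTable(sqlite3.connect(":memory:"))
store.put("doc-1", b"hello world")
print(store.get("doc-1"))
```

Every query touches exactly one row through the primary key, so there are no JOINs for the server to struggle with.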

Key-Value Stores

A new generation of data stores is gaining popularity. The most notable ones are Apache’s HBase, Facebook’s Cassandra, LinkedIn’s Project Voldemort and Baidu’s Hypertable. Each of these has big-name backers and a lot of hype, and is trying to do what Google does with BigTable.

But they have to become more mature before they become useful for us. For example, when HBase crashes (and it does), it prints the most uninformative error log. Hash-based partitioning is used for load balancing, which provides little visibility into where the data actually is, and is often less optimal than logical partitioning when it comes to latency. There is also a very limited user base for each of these outside of their parent companies (which also means bad documentation).
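The visibility point is easier to see side by side. The following sketch compares the two approaches on a handful of keys; the key format (`YYYY-MM-DD/id`), the use of MD5 and the 16-partition count are assumptions for illustration and do not reflect how any of the stores above hash their keys.

```python
import hashlib


def hash_partition(key, partitions):
    """Pick a partition from a hash of the key (spreads load evenly)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def logical_partition(key):
    """Pick a partition from the month prefix of a 'YYYY-MM-DD/id' key."""
    return key[:7]


# One document per day for the first week of a month.
keys = [f"2013-05-{day:02d}/doc-1" for day in range(1, 8)]

hash_homes = {hash_partition(key, 16) for key in keys}
logical_homes = {logical_partition(key) for key in keys}

print("hash partitions touched:   ", sorted(hash_homes))
print("logical partitions touched:", sorted(logical_homes))
```

With logical partitioning, a week of related documents lives in one predictable place that you can name just by looking at the key. With hashing, the same documents are typically scattered across partitions, and you have to compute the hash to find out where each one went.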

Tokyo Tyrant is another option because it is simple, fast and good for specialized needs.

In summary, it’s all about data. More data means more we can do with it (and less sleep). I will explore some more topics, including real-time indexing, sentiment analysis, and load balancing, in my next posts, so stay tuned.