“Data creation is exploding,” says Gavin Belson. “With all the selfies and useless files people refuse to delete on the cloud, 92 percent of the world’s data was created in the last two years alone. At the current rate, the world’s data storage capacity will be overtaken by next spring. It will be nothing short of a catastrophe. Data shortages, data rationing, data black markets. Someone’s compression will save the world from data-geddon, and it sure as hell better be Nucleus and not goddamn Pied Piper!”
Belson is a fictional character, of course, from the HBO show Silicon Valley. But like a lot of the show’s satire, his over-the-top rant hints at a deeper truth: in the big data era, storage really is becoming a problem. As companies adopt big data analytics, they’re struggling to deal with the mountains of data they’re producing. Compression isn’t making those mountains small enough, and compressed files aren’t searchable, which makes accessing them to take advantage of “big data” a big pain in the ass.
Pied Piper, the fictional startup from Silicon Valley headed by a tech genius who created a revolutionary algorithm, doesn’t exist in real life. But Terark does. It’s nowhere near Silicon Valley – the startup is based in Beijing – but fans of the show will find a lot that’s familiar about the Chinese compression startup. Built on the back of a revolutionary algorithm created by a brilliant but (sorry Lei Peng!) somewhat socially awkward engineer, Terark is taking on what Gavin Belson called “data-geddon” – and aiming to beat some global tech powerhouses in the process.
“Our technology allows us to make big data smaller,” Terark VP Remy Tricard told me, and that’s the short version. The longer version is that CTO Lei Peng developed an incredibly efficient compression algorithm while messing around on a passion project, trying to speed up Chinese text input methods. That seemed like a commercial dead end, but applying Lei’s algorithm elsewhere proved more promising. Now, Terark is a database compression company, and thanks to Lei’s algorithm, it’s blowing the database technology of global competitors like Facebook and Google out of the water – at least in terms of performance.
According to Terark’s site, TerarkDB – the company’s database storage engine – boasts random read speeds more than 230 times faster than Facebook’s RocksDB. The company also claims a much better compression ratio and lower latency. In MySQL and MongoDB databases, Terark’s tech is similarly dominant. The cherry on top of it all is that Terark’s tech makes compressed data searchable without decompressing it, which greatly reduces server load.
If that sounds like little more than a startup’s self-promoting hype, consider this: Terark has a team of just 10 people, it has barely been around two years, and the company is already profitable. Moreover, it has attracted high-profile clients like Alibaba, which has a US$1 million contract with Terark to use its tech for Alibaba Cloud.
When I asked Lei how a tiny startup was able to land such a major client so quickly, he said it was simple. Big companies have to deal with big amounts of data. Any technology that can reduce server and storage loads will save them a small fortune. That alone has apparently been enough to land Terark some of China’s biggest cloud computing fish (it also has a contract with QingCloud), but now the company is also looking overseas. Deals with other major cloud providers both in China and in the US are in the pipeline, Terark says, although no one was willing to name names.
Anyone who’s watched Silicon Valley would probably be hesitant to say that the future looks bright for Terark. Just when things start to go well for Pied Piper, something else falls apart. But that’s just a TV show. The real-world Pied Piper is already profitable, isn’t interested in VC money (at least for the moment), and is focused on expanding its database business, confident that it has the best product on the market. But even if that doesn’t work out, the company still has Lei’s algorithm – and if we’ve learned anything from Silicon Valley, it’s that a great algorithm can be enough to keep you in the game even when everything goes wrong.
[Full disclosure: Terark is a Y Combinator graduate; so is Tech in Asia.]