As anyone who likes to spend their time reading research blogs knows, Google have recently made their TensorFlow-based parser, SyntaxNet, open source. SyntaxNet is, for the moment, the world’s most accurate natural language parser. The version that parses the English language is quite excellently named Parsey McParseface. It is trained to analyse English text, and is able to explain the functional role of each word in a given sentence.
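To make "the functional role of each word" a little more concrete, here is a minimal sketch of the kind of structure a dependency parser produces. The parse is hand-annotated for illustration rather than real Parsey McParseface output, and the `Token` class, the `describe` helper, and the Stanford-style relation labels (`nsubj`, `dobj`) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int     # 1-based position in the sentence
    text: str      # the word itself
    head: int      # index of the word this one depends on (0 = root)
    relation: str  # grammatical role relative to the head


# Hand-annotated parse of "Alice saw Bob" (illustrative, not parser output).
SENTENCE = [
    Token(1, "Alice", 2, "nsubj"),  # subject of "saw"
    Token(2, "saw", 0, "ROOT"),     # main verb, root of the sentence
    Token(3, "Bob", 2, "dobj"),     # direct object of "saw"
]


def describe(tokens):
    """Return one human-readable line per token explaining its role."""
    by_index = {t.index: t for t in tokens}
    lines = []
    for t in tokens:
        if t.head == 0:
            lines.append(f"{t.text} is the root of the sentence")
        else:
            lines.append(f"{t.text} is the {t.relation} of {by_index[t.head].text}")
    return lines


for line in describe(SENTENCE):
    print(line)
```

A real parser's job is to fill in the `head` and `relation` fields automatically; its accuracy figure is roughly the share of words for which it gets them right.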
As exciting as this is, we mustn’t treat this development from Google as earth-shattering news. Academia and artificial intelligence departments have been using highly accurate parsers like spaCy for years. Parsey McParseface is about 94% accurate at 600 words per second, whilst spaCy is about 92% accurate at 15,000 words per second. In other words, SyntaxNet makes about 20% fewer errors than its closest competitor, spaCy, but spaCy is far faster. The Stanford Parser is also worth mentioning.
Without going heavily into the technical details (you can read them here), we can ask: what is this actually useful for? It turns out that Parsey has lots of uses and, as it has only just been released to the public, its applications haven’t been fully realised yet. I would like to tell you about these applications, real and potential, and how they will begin to affect the world of search, marketing, and social media (and not just the possibility of Skynet, which SyntaxNet sounds eerily similar to).
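For anyone who wants to check the trade-off, the comparison can be worked through from the rounded figures quoted here. This is only a back-of-the-envelope sketch; the function name is made up for the example, and the inputs are the approximate numbers from the text rather than fresh benchmark results.

```python
def relative_error_reduction(acc_a, acc_b):
    """Fraction of B's errors that A avoids, given accuracies in percent."""
    err_a = 100 - acc_a
    err_b = 100 - acc_b
    return (err_b - err_a) / err_b


# Rounded figures quoted in the text (approximate, not re-measured).
parsey_acc, parsey_wps = 94, 600
spacy_acc, spacy_wps = 92, 15_000

reduction = relative_error_reduction(parsey_acc, spacy_acc)
speedup = spacy_wps / parsey_wps

print(f"Parsey makes {reduction:.0%} fewer errors than spaCy")
print(f"spaCy is {speedup:.0f}x faster than Parsey")
```

With these rounded inputs the error reduction comes out at 25% rather than "about 20%", so treat both numbers as approximate. The speed gap is the more striking figure: spaCy gets through roughly 25 times as many words per second.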
Sentiment analysis is used to determine a particular speaker’s or writer’s attitude towards a particular subject. There are three main methods: statistical, knowledge-based, and hybrid.
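As a rough illustration of the knowledge-based method, here is a minimal sketch that scores text against a word list. The tiny `LEXICON` is hypothetical and invented for this example; real systems rely on much larger curated lexicons.

```python
import re

# Hypothetical toy lexicon; real knowledge-based systems use far larger lists.
LEXICON = {"great": 1, "good": 1, "love": 1, "bad": -1, "awful": -1, "boring": -1}


def lexicon_score(text):
    """Sum the polarity of every known word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def label(text):
    """Map the numeric score onto a coarse sentiment label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label("A great film, I love it"))
print(label("Not good at all"))
```

The second call is the interesting one: a plain word list sees "good" and calls the review positive, because it has no idea that "not" changes the meaning. That blind spot is exactly where sentence structure starts to matter.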
Metacritic, Rotten Tomatoes, Epinions, IMDb, and TestFreaks are good examples of aggregator websites that use sentiment analysis extensively. However, as good as these websites are, how accurate can they truly be? Sarcasm, terseness, and general language ambiguity are still difficult concepts for machines to learn about, as are longer sentences. Recursive Deep Models have helped overcome such problems, but nuance in sentences is still difficult for machines to pick up.
SyntaxNet, thanks to its accuracy, will help make these nuances easier to pick up, leading to more accurate analysis of a subject. We can already look at Twitter to gauge political sentiment. We may one day be able to predict a change of leadership in a country before it even happens!
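One way a parse could help with nuance is negation. The sketch below assumes a parser has already told us which word each "not" attaches to; the parse is hand-annotated with Stanford-style labels (`neg`, `cop`, `nsubj`) rather than taken from SyntaxNet, and the word list and function name are hypothetical.

```python
# Hypothetical toy lexicon of word polarities.
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}

# Hand-annotated (word, head_index, relation) triples for
# "The film was not good". Indices are 1-based and head 0 marks the root.
# This is illustrative, not actual parser output.
PARSE = [
    ("The", 2, "det"),
    ("film", 5, "nsubj"),
    ("was", 5, "cop"),
    ("not", 5, "neg"),
    ("good", 0, "ROOT"),
]


def parse_aware_score(parse):
    """Sum word polarities, flipping any word that has a 'neg' dependent."""
    negated = {head for _, head, relation in parse if relation == "neg"}
    score = 0
    for index, (word, _, _) in enumerate(parse, start=1):
        polarity = LEXICON.get(word.lower(), 0)
        score += -polarity if index in negated else polarity
    return score


print(parse_aware_score(PARSE))
```

Because the parse says "not" modifies "good", the score flips to negative. The more accurately the parser attaches words like "not", the less often this kind of rule misfires.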
Have you ever found an amazing bit of content online, but struggled to find out who made it? Well, with Parsey McParseface, finding out who wrote something becomes much easier. We can even make highly accurate guesses, as we can identify a particular style of writing more precisely. The study of writing style for this purpose is known as stylometry.
The more accurate the parser, the more accurately we can attribute something to a particular author. This can help us detect forgeries and plagiarism, and can even aid criminal investigations (think harassment and terror plots).
There is still the problem of bias, as neural nets conform to their training sets and therefore tend to select authors they have already analysed. Even so, more accurate parsers like SyntaxNet can paint a more precise picture of a writer’s style. We may even be able to figure out who’s speaking to us in real-time chats, so fake profiles and the like become easier to spot.
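To give a flavour of how attribution can work, here is a minimal stylometry sketch that compares how often each text uses a handful of common function words. This is one simple, well-known technique and does not use a parser at all; the word list, sample texts, and function names are assumptions made for illustration. A parser would add richer features, such as typical sentence structures.

```python
import math
import re
from collections import Counter

# A small, hypothetical set of function words to profile.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "but", "with"]


def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_likely_author(unknown_text, known_texts):
    """Pick the known author whose profile is closest to the unknown text."""
    target = profile(unknown_text)
    return max(known_texts, key=lambda name: cosine(target, profile(known_texts[name])))


# Made-up writing samples for two hypothetical authors.
samples = {
    "author_a": "the cat and the dog and the bird",
    "author_b": "it is true that it rains but it stops",
}
print(most_likely_author("the sun and the moon and the stars", samples))
```

Note that the function can only ever answer with an author it has already seen, which is the bias problem in miniature.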
Imagine if every website, article, advert, and review had a percentage score telling you how credible it was. Parsey McParseface’s technology could make this possible by analysing the arguments being made, fact-checking them on the fly against other sources, and flagging patterns of language and speech that frequently correlate with spurious claims.
This could be a game-changer for journalists, politicians, religious groups, and advertisers, who may lose some of their power as their audience quickly learns exactly how trustworthy they are. The same goes for businesses that publish fake reviews to lure potential customers in. Suddenly, readers are a whole lot more informed, without having to go through the laborious process of fact-checking everything they read themselves.
Imagine being able to go to another country and having a phone app that can translate a foreign language to an almost-perfect degree in real-time. This has been a dream for many over the years and, the more accurate the parser, the closer we get to realising it.
Business and social interactions become a heck of a lot easier. A solid universal translator can even help make learning languages simpler, as it picks up on how words are used in specific contexts. Dialect, slang, and other idiosyncrasies that can throw a beginner off are accounted for, which more traditional learning methods often fail to do. A Neuromancer-style chip that can be attached to the ear is all too possible.
As great as Google search is, it still has its limits. Every addition to a search term reduces SERP accuracy. Moreover, there are vast numbers of websites, all using different content management systems, which can easily reduce your chances of finding the exact result you’re looking for.
With a precise parser, these problems are diminished. Search engines will begin to know – and even predict – your search query with startling specificity. This means that entire phrases can become a string of keywords, rather than just single words.
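A small sketch shows why matching whole phrases differs from matching single words. It compares a query and a document by the word sequences (n-grams) they share; the function names and example sentences are invented for illustration, and real search engines are of course far more sophisticated.

```python
def ngrams(text, n):
    """All runs of n consecutive words in the text, lowercased."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def overlap(query, document, n):
    """Share of the query's n-grams that also appear in the document."""
    query_grams = set(ngrams(query, n))
    document_grams = set(ngrams(document, n))
    if not query_grams:
        return 0.0
    return len(query_grams & document_grams) / len(query_grams)


query = "dog bites man"
document = "man bites dog"

print(overlap(query, document, 1))  # single-word match
print(overlap(query, document, 2))  # two-word phrase match
```

Treated as single keywords, the two sentences look identical, because every word matches. Treated as phrases, they share nothing, which is much closer to what a human reader would say.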
Marketeers can use this to open up competition in search results: instead of fighting over particular keywords and their synonyms, they can create content targeted at a specific phrase.
Feeling lonely? Ever wished that Siri could actually talk to you, or at least follow your demands properly? Perhaps you want something similar to the computer in Star Trek, capable of retrieving and analysing huge amounts of information in seconds?
Well, with a parser this accurate, you may well end up with a computer that learns from and reacts to your speech. The computer will learn what you like to talk about and could have potentially meaningful conversations with you. (Just to note: even most of Starfleet’s computers aren’t capable of this yet, being confounded by topics on love and the human condition.)
The implications of this are wider than you might think. Yes, it could be one step closer to an artificial girl/boy/intersex friend, but it could also mean machines are able to carry out complex assignments (e.g. microsurgery or structural engineering) using voice commands. This points to a world where computers and humans are far more closely integrated.