Gradient Flow’s annual NLP industry survey sheds light on the practices, technologies, and challenges defining natural language processing this year
Spark NLP is the most widely used NLP library among respondents.
LEWES, Del., Sept. 21, 2021 (GLOBE NEWSWIRE) – John Snow Labs, the healthcare AI and NLP company and developer of the Spark NLP library, today announced the results of the 2021 Natural Language Processing (NLP) Industry Survey, which examines how companies are using NLP today. The results include a detailed analysis of the NLP technologies companies have implemented, along with budgets, trends, widely used tools and cloud platforms, and use cases. The survey was conducted by Gradient Flow, an independent provider of data science analysis and insights.
Although responses came from a variety of industries, company sizes, stages of NLP adoption, and geographic locations, the global survey showed that NLP budgets are growing across the board. In fact, 60% of tech leaders said their NLP budget had increased by at least 10%, while 33% reported a 30% increase and 15% said their budget had more than doubled. This is a steady increase compared to 2020, suggesting that pandemic-related financial constraints may be stabilizing.
While investment in NLP has been healthy, practitioners face some significant barriers to progress. As in last year’s results, accuracy was the number one requirement when evaluating an NLP solution. When asked about the key challenges in using cloud NLP services, however, tech leaders cited difficulty tuning models (39%) and cost (36%) as the top two. This matters because models often need to be fine-tuned and customized for their specific domains and applications. As more difficult use cases such as question answering and natural language generation become more common, accuracy will remain critical to success.
Other important findings are:
For the second year in a row, Spark NLP ranked as the most popular NLP library, with 31% of respondents saying they use it.
Most practitioners use multiple libraries. In fact, 53% of respondents said they were using at least one of the following NLP libraries popular in the Python ecosystem: Hugging Face, spaCy, Natural Language Toolkit (NLTK), Gensim, or Flair.
For technology leaders, accuracy (40%) was the most important requirement when evaluating an NLP solution, followed by readiness for production (24%) and scalability (16%).
54% of tech leaders cited named entity recognition (NER) and 46% cited document classification as primary use cases for NLP.
Among the respondents from the healthcare industry, entity linking / knowledge graphs (41%) and de-identification (39%) were among the most common use cases.
83% of all survey participants stated that they use at least one of the four listed NLP cloud services (Google, AWS, Azure, IBM) in addition to NLP libraries.
The three most important data sources for NLP projects are text fields in databases, files (PDFs, docx, etc.) and online content.
The top four industries represented by survey respondents who use NLP are healthcare (17%), technology (16%), education (15%), and financial services (7%), reflecting how widely NLP has been adopted across industries.
The Spark NLP library is particularly dominant in the healthcare sector, where 60% of respondents said they had adopted it.
“As we move into the next phase of NLP growth, it is encouraging to see investments and use cases grow, with mature companies taking the lead,” said Dr. Ben Lorica, survey co-author and external program chair of the NLP Summit. “After the political and pandemic-related uncertainty of the past year, it is exciting to see such progress and potential in an area that is still very much in its infancy.”
The full results of the 2021 NLP survey can be downloaded here and will be presented at the upcoming NLP Summit in a keynote titled “Industry Survey Analysis: Natural Language Use Cases in the Industry in 2021.” To further explore some of the key trends and developments in NLP, join John Snow Labs and speakers from leading companies such as Google, Microsoft, Roche, Kaiser Permanente and others at the NLP Summit.
Free registration is now open for the online conference, which will take place October 5-7. Please send press inquiries to [email protected]. Follow @JohnSnowLabs or #NLPSummit on Twitter for the latest news and updates.
About John Snow Labs
John Snow Labs, the healthcare AI and NLP company, provides cutting-edge software, models, and data to help healthcare and life science companies make sense of AI. The company is the developer of Spark NLP, the world’s most widely used enterprise NLP library, and its award-winning clinical NLP software powers leading healthcare and pharmaceutical companies such as Kaiser Permanente, McKesson, Merck and Roche. John Snow Labs is also the creator and host of the NLP Summit, which educates and promotes the NLP community.
For media inquiries:
John Snow Labs
A graphic accompanying this announcement is available at https://www.globenewswire.com/NewsRoom/AttachmentNg/6ed013a1-2c4f-41ea-b0a3-33f4e3f810c8