[Point of VIEW] 3 Engineering Data Challenges (and Why Your Tools Can’t Solve Them)

Tuesday, June 30, 2015

Today’s post is part of a series exploring areas of focus and innovation for NI software.


Phillips headshot.jpg


Today’s Featured Author

Omid Sojoodi is currently the leader of application
and embedded software for National Instruments.





With the rise of the Industrial Internet of Things, one thing is clear: engineers need to extract meaningful information from the massive amounts of machine data collected.


Data from machines, the fastest growing type of data, is expected to exceed 4.4 zettabytes (that’s 21 zeros) by 2020. This type of data is growing faster than social media data and other traditional sources. This may sound surprising, but when you think about those other data sources, which I call “human limited,” consider that there are only so many tweets or pictures a person can upload throughout the day. And there are only so many movies or TV shows a person can binge watch on Netflix to get to the next set of recommendations. But machines can collect hundreds or even thousands of signals 24/7 in an automated fashion. In the very near future, the data generated by our more than 50 billion connected devices will easily surpass the amount of data humans generate.


The data that machines generate is unique, and big data analysis tools that work for social media data or traditional big data sources just won’t cut it for engineering data. That is why NI is investing in tools to help you overcome common challenges and make data-driven decisions based on your engineering data (no matter the size) confidently.


Challenge 1: 78 percent of data is undocumented.

According to research firm International Data Corporation (IDC), “The Internet of Things will also influence the massive amounts of ’useful data’—data that could be analyzed—in the digital universe. In 2013, only 22 percent of the information in the digital universe was considered useful data, but less than 5 percent of the useful data was actually analyzed.”


Data that is considered useful includes metadata or data that is tagged with additional information. No one wants to open a data source and wonder what the test was, what the channels of information are called, what units the data was collected in, and so on. NI is helping to resolve this issue with our Technical Data Management (TDM) data model. With it, you can add an unlimited number of attributes for a channel, a group of channels, or the entire file. We are constantly updating the infrastructure of this binary (but open) data file, and have recently reached streaming rates of 13.6 GB/s. To make documenting data easier, NI is investing in technologies that will recommend metadata to save with your raw data while offering you the flexibility to add attributes at any point before, during, or after acquisition.


Challenge 2: The average NI customer uses three to five file types for projects.

With so many custom solutions on the market, your current application likely involves a variety of vendors to accomplish your task. Sometimes these vendors require you to use closed software that exports in a custom format. Considered a common pain point, aggregating data from these multiple formats often requires multiple tools to read and analyze the data. NI addresses this challenge with DataPlugins, which map any file format to the universal TDM data model. Then you can use a single tool, such as LabVIEW or DIAdem, to create analysis routines. To date, NI has developed over 1,000 DataPlugins. If one isn’t readily available, NI can write a DataPlugin for you.


Challenge 3: It takes too long to find the data you need to analyze.

The Aberdeen Master Data Management research study interviewed 122 companies and asked how long it takes to find the data they need to analyze. They answered five hours per week! That’s just looking for the data—not analyzing it. From an engineering perspective, this to me is not that shocking. How many of us have faced what I consider to be the ”blank VI syndrome” for data? How do you even begin to start analyzing your data?





A little-known technology that NI continues to invest in is DataFinder. DataFinder indexes any metadata included in the file, file name, or folder hierarchy of any file format. Again, this relies on a well-documented file, but by now I’m sure you have decided to use TDM for your next application.


Once the metadata has been indexed, you can perform queries—either text-based, like you would in your favorite search engine, or conditional queries like in a database—to find data in seconds. With this advanced querying, you can return results at a channel level to track trends in individual channels from multiple files over time.


In addition, NI is continuing to innovate to make analyzing your data easier than ever. Imagine a future when, as soon as a file is saved, the DataFinder recognizes the data, indexes the metadata, and cleanses the raw data (normalizes channel names so that rpm = speed = revs or performs statistical calculations automatically). Then an analysis routine, written in your language of choice, acts on each data file and automatically archives the data or sends a report to your email or mobile device. This technology ensures that your data-driven decisions are being made with 100 percent of the data and not just 5 percent, as IDC estimates suggest today.


Stay tuned, everyone.