Looking for partners for H2020-ICT-2018-2020, topic: Big Data technologies and extreme-scale analytics
I would like to submit a Research and Innovation Action (RIA) proposal on developing new analytics tools for data described by distributions. I am a statistician working on this topic.
Today, data are produced in very large volumes by sensors, surveys, and IoT devices. The main idea is to store data as summaries, which is already common in data stream environments and in the analysis of high-throughput data. Each batch of data is summarized by one distribution per feature, and these distributions become the input of new data analytics tools (basic statistics, visualization, dimension reduction, clustering, classification, regression and time series analysis) that extend classical analytics tools to this new kind of data. The idea is inspired by the Symbolic Data Analysis approach to knowledge extraction. In Europe, several research institutes and universities are working on this topic, and software in R is also available (the HistDAWass package is one example). An interesting point is the integration of distributions of different natures (categorical and numerical) in the analysis of such data. Applications have been proposed in different fields (wireless sensor networks, official statistics, data stream analysis).
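To make the idea concrete, here is a minimal sketch in Python of summarizing a batch of raw readings as a histogram and then comparing two such summaries. It is only an illustration under stated assumptions: the function names (`summarize`, `quantiles`, `wasserstein2`) are hypothetical, values are assumed to be uniformly spread within each bin, and the L2 Wasserstein distance is approximated on a grid of probabilities. It does not use or reproduce the HistDAWass API.

```python
import numpy as np

def summarize(batch, bins=10):
    """Summarize a 1-D batch of raw readings as a histogram (counts, bin edges)."""
    counts, edges = np.histogram(batch, bins=bins)
    return counts, edges

def quantiles(hist, probs):
    """Approximate quantile function of a histogram.

    Assumption: values are uniformly distributed within each bin, so the
    CDF is piecewise linear and can be inverted by linear interpolation.
    """
    counts, edges = hist
    cdf = np.concatenate(([0.0], np.cumsum(counts) / counts.sum()))
    return np.interp(probs, cdf, edges)

def wasserstein2(hist_a, hist_b, grid=1000):
    """Approximate L2 Wasserstein distance between two histograms.

    The distance is computed from the two quantile functions evaluated
    at the midpoints of a regular grid of probabilities in (0, 1).
    """
    probs = (np.arange(grid) + 0.5) / grid
    diff = quantiles(hist_a, probs) - quantiles(hist_b, probs)
    return float(np.sqrt(np.mean(diff ** 2)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical sensors, each reduced from 1000 readings to 10 bins.
    sensor_a = summarize(rng.normal(loc=20.0, scale=2.0, size=1000))
    sensor_b = summarize(rng.normal(loc=23.0, scale=2.0, size=1000))
    print("distance between summaries:", wasserstein2(sensor_a, sensor_b))
```

Once every batch is reduced to such a summary, a distance like this one can feed clustering or classification methods directly, without going back to the raw readings.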
For further details, please feel free to contact me at email@example.com.