The project was developed under the auspices of the "Data4Impact" H2020 project, which aimed to assess the performance of EU and national research and innovation systems. In collaboration with the "Athena" Research Center, the project takes a network approach to the problem, developing a multi-layer graph infrastructure to assess the societal impact of health-related research in Europe over the past 10 years using data mining and machine learning techniques.
[Data4Impact Homepage] [Deliverable (PDF)]
We developed a Convolutional Neural Network (CNN) that takes the spectrograms of the initial audio clips and performs a multi-label classification task with 50+ potential tags. This way, we can (1) label audio clips without hand-engineered features such as Mel Frequency Cepstral Coefficients (MFCCs), which require expert knowledge, and (2) estimate the similarity between audio clips or songs by counting the tags they have in common.
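The tag-overlap idea can be illustrated with a minimal sketch. It assumes the network outputs one independent probability per tag and that a tag counts as predicted when its probability reaches a threshold of 0.5. The function names, the threshold, and the example tags are all hypothetical and are not taken from the project's code.

```python
def predicted_tags(probabilities, tag_names, threshold=0.5):
    """Turn per-tag probabilities into a set of predicted tags.

    Assumption: each tag is scored independently (multi-label setting),
    so any number of tags may pass the threshold.
    """
    return {name for name, p in zip(tag_names, probabilities) if p >= threshold}


def tag_overlap(tags_a, tags_b):
    """Similarity of two clips as the number of tags they share."""
    return len(set(tags_a) & set(tags_b))


# Hypothetical tag vocabulary and model outputs for two clips.
TAGS = ["rock", "jazz", "guitar", "vocal"]
clip_a = predicted_tags([0.9, 0.2, 0.7, 0.1], TAGS)
clip_b = predicted_tags([0.1, 0.8, 0.6, 0.3], TAGS)
shared = tag_overlap(clip_a, clip_b)
```

A raw overlap count favors clips with many tags; dividing by the size of the union (Jaccard similarity) would be one way to normalize it.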
[GitHub Repository]
By successively applying word-level shingling, minhashing, and Locality Sensitive Hashing (LSH), we were able to estimate Jaccard similarity across text corpora. This way, we can (1) retrieve similar documents for any given document and (2) effectively apply the same methodology to significantly larger text collections.
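The three steps can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the repository's implementation: the shingle size (3 words), the number of hash functions (64), the number of bands (16), and the use of salted BLAKE2b as the hash family are all choices made here for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of word-level k-shingles of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(shingle, seed):
    """Deterministic 64-bit hash of a shingle; the seed selects the hash function."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(shingle_set, num_hashes=64):
    """Signature = minimum hash value of the set under each hash function."""
    if not shingle_set:
        raise ValueError("cannot build a signature for an empty shingle set")
    return [min(_hash(s, seed) for s in shingle_set) for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching positions estimates the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidate_pairs(signatures, bands=16):
    """Split each signature into bands; documents sharing a band become candidates."""
    rows = len(next(iter(signatures.values()))) // bands
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for doc_id, sig in signatures.items():
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(doc_id)
        for ids in buckets.values():
            candidates.update(combinations(sorted(ids), 2))
    return candidates
```

Only the candidate pairs returned by `lsh_candidate_pairs` need an exact `jaccard` check, which is what keeps the approach tractable on large collections. More bands with fewer rows each make the filter more permissive; fewer bands make it stricter.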
[GitHub Repository]
Imagine you have a data-related problem that you want to solve. You have heard of all the amazing things that machine learning algorithms can achieve and want to try it for yourself, but you have no prior experience or knowledge in this area. You start googling terms like “machine learning models” and “machine learning methodologies,” but after some time, you find yourself ready to give up, completely lost somewhere among the different algorithms.
Read More at DZone
All of us have at some point worked with spreadsheet software, like Excel or Google Sheets, or with BI tools, and we have to admit that they offer some very handy functionality for data presentation and reporting, such as pivot tables. Since many business applications require some sort of pivot table, I am sure many of you have found yourselves struggling to satisfy these requirements using a database instead of a spreadsheet.
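As a small illustration of the problem, one common way to pivot inside a database is conditional aggregation: one `SUM(CASE ...)` column per pivoted value. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `sales` table; the table, its columns, and the data are invented for the example, and this is not necessarily the technique the article itself presents.

```python
import sqlite3


def pivot_sales_by_quarter(rows):
    """Pivot (region, quarter, amount) rows into one row per region.

    Uses conditional aggregation, so the quarter columns must be
    known in advance and listed explicitly in the query.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        query = """
            SELECT region,
                   SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
                   SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
            FROM sales
            GROUP BY region
            ORDER BY region
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Hypothetical input: long format, one row per (region, quarter) sale.
sample = [
    ("North", "Q1", 100.0),
    ("North", "Q1", 50.0),
    ("North", "Q2", 80.0),
    ("South", "Q1", 200.0),
]
pivoted = pivot_sales_by_quarter(sample)
```

The main limitation of this pattern is that the output columns are fixed in the query text, whereas a spreadsheet pivot table discovers them from the data.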
Read More at DZone
When I first came across table partitioning and started researching it, I realized two things. First, it is a complex operation that requires good planning. Second, in some cases it can prove extremely beneficial, while in others it can be a complete headache.
Read More at DZone