Associate Professor, Particle Physics, Niels Bohr Institute, University of Copenhagen
Applied Machine Learning and Big Data Analysis
This course is a part of
Machine Learning is entering essentially all data-based fields, and Big Data is omnipresent from private industries to governmental organizations. Machine Learning is a new approach to problem solving, and while its potential is often exaggerated, it does introduce new opportunities, although it also poses some very real challenges. The ability to analyze and combine large amounts of data from different sources has obvious applications. However, poor data quality combined with high variance means that conventional analysis often fails, while Machine Learning algorithms are less affected, if trained and used correctly.
This course will bring you to the forefront of the field of applied machine learning by introducing you to the newest tools and methods in large-scale data analysis, based on cutting-edge research and extensive experience.
Best course I have ever been to.
Course directors - Applied Machine Learning and Big Data Analysis
Course details - ML and Big Data Analysis
Key benefits - advanced tools for data cleaning and analysis
After the course you will:
- Be able to set up a basic Machine Learning Analysis from beginning to end: from retrieving and cleaning the data, to establishing the information level, extracting patterns and finding outliers, to curating the necessary data.
- Be acquainted with a number of advanced tools for data cleaning, statistical analysis of very large datasets, data stream analysis, finding patterns and outliers in Big Data, and collecting data from instruments and devices (e.g. the Internet of Things (IoT)), as well as with hardware systems design for efficient handling and analysis of Big Data.
Course content - Machine Learning Analysis from beginning to end
Throughout the course, we will use structured datasets from a commercial context to demonstrate the different steps in Big Data Analysis. Participants will also have the chance to ask questions about their own data and challenges.
Core elements
- Data cleaning and statistical methods: Detecting and correcting (or removing) corrupt or inaccurate records, robust statistical methods for data with very large variance, and cross-checks.
- Machine Learning algorithms: Introduction to a variety of methods, how they work behind the scenes, their strengths and weaknesses, and their applications.
- Finding patterns and outliers in Big Data: Which methods can be used to identify sparse patterns in very large datasets, and how can we identify data that does not follow the general pattern of a dataset?
- Collecting data from instruments and devices: How to collect, store, and analyze data from a multitude of sources (e.g. apparatus, IoT, etc.).
- Systems for Big Data Analysis: Hadoop, PyDisco, etc., and hardware systems design for efficient analysis.
- Selected machine learning algorithms for large-scale data: Random forests, (deep) neural networks, support vector machines, and large-scale exact nearest neighbour search.
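As a small illustration of the robust statistics and outlier-finding elements above, the following is a minimal sketch of one common approach: flagging values by their modified z-score, computed from the median and the median absolute deviation (MAD). The function name, the example readings, and the threshold of 3.5 are assumptions chosen for illustration and are not taken from the course material.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The median and MAD are used instead of the mean and standard
    deviation because a few extreme records barely affect them.
    The threshold of 3.5 is a commonly used convention (assumption).
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so no score can be computed.
        return []
    # 0.6745 rescales the MAD so that the score is comparable to a
    # standard z-score when the data are normally distributed.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical sensor readings with one corrupt record.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 55.0, 10.2]
print(mad_outliers(readings))  # [55.0]
```

A mean-based z-score would be pulled towards the corrupt record itself, which is why median-based measures are often preferred for data with very large variance.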
Tools/methods introduced
- Selected machine learning algorithms for large-scale data: Random forests, support vector machines, and large-scale exact nearest neighbour search
- Data curation: How to select data for long-term curation, and the systems, techniques, and standards for data curation
We will primarily be working with Python; however, all the techniques covered can readily be implemented in any standard data-analysis language.
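To give a flavour of what exact nearest neighbour search means, here is a minimal Python sketch that compares a query point against every stored point. The function name and the sample points are hypothetical. This exhaustive version only shows the definition of the problem: large-scale methods aim to return the same exact answer while avoiding the full scan, for example by using spatial index structures such as k-d trees.

```python
import math


def nearest_neighbour(query, points):
    """Return the point closest to `query` in Euclidean distance.

    Exhaustive search: every candidate is checked, so the result is
    exact, at a cost that grows linearly with the number of points.
    """
    return min(points, key=lambda p: math.dist(query, p))


# Hypothetical two-dimensional data points.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
print(nearest_neighbour((0.9, 1.2), points))  # (1.0, 1.0)
```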
Participant profile
The course is strictly focused on Machine Learning and Big Data Analysis, so a background in statistics and/or conventional data analysis is a prerequisite. The course assumes that you have studied to at least Bachelor's degree level and/or have several years of data analysis experience.
Location
University of Copenhagen
South Campus, Faculty of Law
Njalsgade 76
DK-2300 Copenhagen S
Denmark
Contact
Copenhagen Summer University
csu@adm.ku.dk
+45 3533 3423
Time and Date
14-18 August 2023
09:00-16:30
Really good teachers, very good at explaining and applying it to real data/problems.
Variety of methods covered with examples from the real world.
Very good general overview of ML, I feel much more confident in applying the techniques in my projects.