Data Mining Training Courses

Data Mining Training

Data Mining refers to the process of automatically searching large data sets to discover patterns and useful information.

NobleProg onsite live Data Mining training courses demonstrate through hands-on practice the fundamentals of Data Mining, the fields its methods draw on (including artificial intelligence, machine learning, statistics and database systems), and its use and applications.

Data Mining training is available in various formats, including onsite live training and live instructor-led training using an interactive, remote desktop setup. Local Data Mining training can be carried out live on customer premises or in NobleProg local training centers.

Data Mining Course Outlines

Code Name Duration Overview
processmining Process Mining 21 hours Process mining, or Automated Business Process Discovery (ABPD), is a technique that applies algorithms to event logs for the purpose of analyzing business processes. Process mining goes beyond data storage and data analysis; it bridges data with processes and provides insights into the trends and patterns that affect process efficiency.  Format of the course     The course starts with an overview of the most commonly used techniques for process mining. We discuss the various process discovery algorithms and tools used for discovering and modeling processes based on raw event data. Real-life case studies are examined and data sets are analyzed using the ProM open-source framework. Audience     Data science professionals     Anyone interested in understanding and applying process modeling and data mining
kdd Knowledge Discovery in Databases (KDD) 21 hours Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications for this data mining technique include marketing, fraud detection, telecommunication and manufacturing. In this course, we introduce the processes involved in KDD and carry out a series of exercises to practice the implementation of those processes. Audience     Data analysts or anyone interested in learning how to interpret data to solve problems Format of the course     After a theoretical discussion of KDD, the instructor will present real-life cases which call for the application of KDD to solve a problem. Participants will prepare, select and cleanse sample data sets and use their prior knowledge about the data to propose solutions based on the results of their observations.
druid Druid: Build a fast, real-time data analysis system 21 hours Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business intelligence applications to analyze high volumes of real-time and historical data. It is also well suited for powering fast, interactive, analytic dashboards for end-users. Druid is used by companies such as Alibaba, Airbnb, Cisco, eBay, Netflix, PayPal, and Yahoo. In this course we explore some of the limitations of data warehouse solutions and discuss how Druid can complement those technologies to form a flexible and scalable streaming analytics stack. We walk through many examples, offering participants the chance to implement and test Druid-based solutions in a lab environment. Audience     Application developers     Software engineers     Technical consultants     DevOps professionals     Architecture engineers Format of the course     Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding
mdlmrah Model MapReduce and Apache Hadoop 14 hours The course is intended for IT specialists who work with the distributed processing of large data sets across clusters of computers.
BigData_ A practical introduction to Data Analysis and Big Data 35 hours Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools. Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class. The course starts with an introduction to elemental concepts of Big Data, then progresses into the programming languages and methodologies used to perform Data Analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, Distributed Processing, and Scalability. Audience Developers / programmers IT consultants Format of the course Part lecture, part discussion, hands-on practice and implementation, occasional quizzes to measure progress.
sspsspas Statistics with SPSS Predictive Analytics Software 14 hours Goal: Learn to work with SPSS independently. Audience: Analysts, researchers, scientists, students and all those who want to acquire the ability to use the SPSS package and learn popular data mining techniques.
ApHadm1 Apache Hadoop: Manipulation and Transformation of Data Performance 21 hours This course is intended for developers, architects, data scientists or any profile that requires access to data either intensively or on a regular basis. The major focus of the course is data manipulation and transformation. Among the tools in the Hadoop ecosystem, this course includes the use of Pig and Hive, both of which are heavily used for data transformation and manipulation. This training also addresses performance metrics and performance optimisation. The course is entirely hands-on and is punctuated by presentations of the theoretical aspects.
datamin Data Mining 21 hours This course can be delivered with any tools, including free open-source data mining software and applications.
matlabfundamentalsfinance MATLAB Fundamentals + MATLAB for Finance 35 hours This course provides a comprehensive introduction to the MATLAB technical computing environment + an introduction to using MATLAB for financial applications. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include: Working with the MATLAB user interface Entering commands and creating variables Analyzing vectors and matrices Visualizing vector and matrix data Working with data files Working with data types Automating commands with scripts Writing programs with logic and flow control Writing functions Using the Financial Toolbox for quantitative analysis
bdbiga Big Data Business Intelligence for Govt. Agencies 35 hours Advances in technologies and the increasing amount of information are transforming how business is conducted in many industries, including government. Government data generation and digital archiving rates are on the rise due to the rapid growth of mobile devices and applications, smart sensors and devices, cloud computing solutions, and citizen-facing portals. As digital information expands and becomes more complex, information management, processing, storage, security, and disposition become more complex as well. New capture, search, discovery, and analysis tools are helping organizations gain insights from their unstructured data. The government market is at a tipping point, realizing that information is a strategic asset, and government needs to protect, leverage, and analyze both structured and unstructured information to better serve and meet mission requirements. As government leaders strive to evolve data-driven organizations to successfully accomplish mission, they are laying the groundwork to correlate dependencies across events, people, processes, and information. High-value government solutions will be created from a mashup of the most disruptive technologies: Mobile devices and applications Cloud services Social business technologies and networking Big Data and analytics IDC predicts that by 2020, the IT industry will reach $5 trillion, approximately $1.7 trillion larger than today, and that 80% of the industry's growth will be driven by these 3rd Platform technologies. In the long term, these technologies will be key tools for dealing with the complexity of increased digital information. Big Data is one of the intelligent industry solutions and allows government to make better decisions by taking action based on patterns revealed by analyzing large volumes of data — related and unrelated, structured and unstructured. 
But accomplishing these feats takes far more than simply accumulating massive quantities of data. “Making sense of these volumes of Big Data requires cutting-edge tools and technologies that can analyze and extract useful knowledge from vast and diverse streams of information,” Tom Kalil and Fen Zhao of the White House Office of Science and Technology Policy wrote in a post on the OSTP Blog. The White House took a step toward helping agencies find these technologies when it established the National Big Data Research and Development Initiative in 2012. The initiative included more than $200 million to make the most of the explosion of Big Data and the tools needed to analyze it. The challenges that Big Data poses are nearly as daunting as its promise is encouraging. Storing data efficiently is one of these challenges. As always, budgets are tight, so agencies must minimize the per-megabyte price of storage and keep the data within easy access so that users can get it when they want it and how they need it. Backing up massive quantities of data heightens the challenge. Analyzing the data effectively is another major challenge. Many agencies employ commercial tools that enable them to sift through the mountains of data, spotting trends that can help them operate more efficiently. (A recent study by MeriTalk found that federal IT executives think Big Data could help agencies save more than $500 billion while also fulfilling mission objectives.) Custom-developed Big Data tools also are allowing agencies to address the need to analyze their data. For example, the Oak Ridge National Laboratory’s Computational Data Analytics Group has made its Piranha data analytics system available to other agencies. The system has helped medical researchers find a link that can alert doctors to aortic aneurysms before they strike. It’s also used for more mundane tasks, such as sifting through résumés to connect job candidates with hiring managers.
matlabdsandreporting MATLAB Fundamentals, Data Science & Report Generation 126 hours In the first part of this training, we cover the fundamentals of MATLAB and its function as both a language and a platform.  Included in this discussion is an introduction to MATLAB syntax, arrays and matrices, data visualization, script development, and object-oriented principles. In the second part, we demonstrate how to use MATLAB for data mining, machine learning and predictive analytics. To provide participants with a clear and practical perspective of MATLAB's approach and power, we draw comparisons between using MATLAB and using other tools such as spreadsheets, C, C++, and Visual Basic. In the third part of the training, participants learn how to streamline their work by automating their data processing and report generation. Throughout the course, participants will put into practice the ideas learned through hands-on exercises in a lab environment. By the end of the training, participants will have a thorough grasp of MATLAB's capabilities and will be able to employ it for solving real-world data science problems as well as for streamlining their work through automation. Assessments will be conducted throughout the course to gauge progress. Format of the course Course includes theoretical and practical exercises, including case discussions, sample code inspection, and hands-on implementation. Note Practice sessions will be based on pre-arranged sample data report templates. If you have specific requirements, please contact us to arrange.
d2dbdpa From Data to Decision with Big Data and Predictive Analytics 21 hours Audience If you are trying to make sense of the data you have access to, or want to analyse unstructured data available on the net (such as Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to choose what data is worth collecting and what is worth analyzing. It is not aimed at people configuring the solution, although they will benefit from the big picture. Delivery Mode During the course, delegates will be presented with working examples of mostly open source technologies. Short lectures will be followed by presentations and simple exercises by the participants. Content and Software used All software used is updated each time the course is run, so we use the newest versions possible. The course covers the whole process, from obtaining, formatting, processing and analysing the data to explaining how to automate the decision-making process with machine learning.
TalendDI Talend Open Studio for Data Integration 28 hours Talend Open Studio for Data Integration is an open-source data integration product used to combine, convert and update data in various locations across a business. In this instructor-led, live training, participants will learn how to use the Talend ETL tool to carry out data transformation, data extraction, and connectivity with Hadoop, Hive, and Pig.   By the end of this training, participants will be able to: Explain the concepts behind ETL (Extract, Transform, Load) and propagation Define ETL methods and ETL tools to connect with Hadoop Efficiently amass, retrieve, digest, consume, transform and shape big data in accordance with business requirements Upload to and extract large records from Hadoop, Hive, and NoSQL databases Audience Business intelligence professionals Project managers Database professionals SQL Developers ETL Developers Solution architects Data architects Data warehousing professionals System administrators and integrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
dataminr Data Mining with R 14 hours R is an open-source free programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has a wide variety of packages for data mining.
PentahoDI Pentaho Data Integration Fundamentals 21 hours Pentaho Data Integration is an open-source data integration tool for defining jobs and data transformations. In this instructor-led, live training, participants will learn how to use Pentaho Data Integration's powerful ETL capabilities and rich GUI to manage an entire big data lifecycle, maximizing the value of data to the organization. By the end of this training, participants will be able to: Create, preview, and run basic data transformations containing steps and hops Configure and secure the Pentaho Enterprise Repository Harness disparate sources of data and generate a single, unified version of the truth in an analytics-ready format Provide results to third-party applications for further processing Audience Data analysts ETL developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
psr Introduction to Recommendation Systems 7 hours Audience Marketing department employees, IT strategists and other people involved in decisions related to the design and implementation of recommender systems. Format Short theoretical background followed by analysis of working examples and short, simple exercises.
pmml Predictive Models with PMML 7 hours The course is designed for scientists, developers, analysts or anyone else who wants to standardize or exchange their models using the Predictive Model Markup Language (PMML) file format.
datashrinkgov Data Shrinkage for Government 14 hours
matlab2 MATLAB Fundamentals 21 hours This three-day course provides a comprehensive introduction to the MATLAB technical computing environment. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include:     Working with the MATLAB user interface     Entering commands and creating variables     Analyzing vectors and matrices     Visualizing vector and matrix data     Working with data files     Working with data types     Automating commands with scripts     Writing programs with logic and flow control     Writing functions
datavis1 Data Visualization 28 hours This course is intended for engineers and decision makers working in data mining and knowledge discovery. You will learn how to create effective plots and how to present and represent your data in a way that appeals to decision makers and helps them understand hidden information.
dsbda Data Science for Big Data Analytics 35 hours Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy.
neo4j Beyond the relational database: neo4j 21 hours Relational, table-based databases such as Oracle and MySQL have long been the standard for organizing and storing data. However, the growing size and fluidity of data have made it difficult for these traditional systems to efficiently execute highly complex queries on the data. Imagine replacing rows-and-columns-based data storage with object-based data storage, whereby entities (e.g., a person) could be stored as data nodes, then easily queried on the basis of their vast, multi-linear relationships with other nodes. And imagine querying these connections and their associated objects and properties using a compact syntax, up to 20 times lighter than SQL. This is what graph databases, such as neo4j, offer. In this hands-on course, we will set up a live project and put into practice the skills to model, manage and access your data. We compare and contrast graph databases with SQL-based databases as well as other NoSQL databases and clarify when and where it makes sense to implement each within your infrastructure. Audience Database administrators (DBAs) Data analysts Developers System Administrators DevOps engineers Business Analysts CTOs CIOs Format of the course Heavy emphasis on hands-on practice. Most of the concepts are learned through samples, exercises and hands-on development.

Course Discounts

Course Venue Course Date Course Price [Remote / Classroom]
MySQL Database Administration Atlanta, GA - One Hartsfield Thu, Mar 1 2018, 9:30 am $2570 / N/A
Test Automation with Selenium CO, Denver - Denver Place Mon, Mar 12 2018, 9:30 am $3445 / $4995
Solr for Developers VA, Stafford - Quantico Corporate Mon, Mar 19 2018, 9:30 am $4455 / $5855
Excel VBA Introduction New York (NYC) - Midtown Manhattan - Park Avenue & E48-49th (Grand Central) Mon, Mar 26 2018, 9:30 am $2430 / $4130
Introduction to R New York (NYC) - Midtown Manhattan - Madison & E38-39th Wed, Mar 28 2018, 9:30 am N/A / $5800
JMeter Fundamentals and JMeter Advanced New York (NYC) - Midtown Manhattan - Park Avenue & E48-49th (Grand Central) Wed, Mar 28 2018, 9:30 am N/A / $3850
Drupal 8 for Developers New York (NYC) - Midtown Manhattan - Madison & E38-39th Thu, Mar 29 2018, 9:30 am N/A / $3900
Introduction to IoT Using Arduino WI, Milwaukee - Downtown Milwaukee Mon, Apr 2 2018, 9:30 am $2970 / $4330
Administering MediaWiki New York (NYC) - Midtown Manhattan - Park Avenue & E48-49th (Grand Central) Mon, Apr 2 2018, 9:30 am N/A / $2250
IT Automation with Saltstack Remote Course - Eastern Time (UTC-05:00) US & Canada Wed, Apr 4 2018, 9:30 am $2370 / N/A
SQL Fundamentals New York (NYC) - Midtown Manhattan - Park Avenue & E48-49th (Grand Central) Mon, Apr 16 2018, 9:30 am N/A / $3700
Introduction to Selenium New York (NYC) - Midtown Manhattan - Madison & E38-39th Tue, Apr 17 2018, 9:30 am N/A / $2300
Apache Tomcat Administration IL, Chicago - CBD - West Loop Riverside Plaza Center Wed, May 2 2018, 9:30 am $4833 / $6233
Excel VBA Introduction New York (NYC) - Midtown Manhattan - Park Avenue & E48-49th (Grand Central) Thu, May 10 2018, 9:30 am N/A / $3200
Neural Network in R MA, Boston - Federal Street Mon, Jul 2 2018, 9:30 am $3150 / $4390
