- CRoss Industry Standard Process for Data Mining – CRISP-DM
- A process model well suited to Data Mining and Analytics projects
- A six-step process model that gives a structured approach to handling Data Science as well as Artificial Intelligence projects
While all the steps are equally important, let us discuss each step in further detail.
Get started with the first stage of CRISP-DM:
Step 1: Business Understanding
The Four Key Steps of Business Understanding Phase of CRISP-DM
- Define Business Problem
- Assess and Analyze Scenarios
- Define Data Mining Problem
- Project Plan
1.a. Define Business Problem
Understanding the business problem is pivotal because of the garbage-in, garbage-out principle: if Data Scientists and/or AI experts fail at this step, all the subsequent steps will be futile attempts at solving the business problem.
Recommended: While it is fair to assume that customers understand their business problem well, one has to ‘don the hat’ of a consultant and perform market research. Research the challenges of the industry in which the customer operates, and ask whether there is a better problem to solve than the one proposed by the customer.
Sources that will help you perform a thorough review:
- Economist Intelligence Unit for International Market
- CEIC – India Premium Database
As soon as one understands the business problem, the next logical move is to record the business objectives and constraints aligned with it. It is always good to keep these short, preferably 2 or 3 words. It is also advisable to include optimization terms such as Maximize or Minimize.
Here are a few examples of composing Business Objectives and Business Constraints for a given problem.
A classic Banking Industry business problem solved using Data Analytics.
- Business Problem A: A significant proportion of customers who take loans are unable to repay
- Business Objective: Minimize Loan Defaulters
- Business Constraints: Maximize Profits
Another classic Financial Services Insurance Industry business problem solved using Data Science.
- Business Problem B: A significant proportion of customers complain that they did not make the credit card transaction
- Business Objective: Minimize Fraud
- Business Constraints: Maximize Convenience
Agriculture sector business problem solved using Artificial Intelligence.
- Business Problem C: Yield of crop is not improving year on year
- Business Objective: Maximize Yield
- Business Constraints: Minimize Cost
Ecommerce industry business problems solved using Machine Learning.
- Business Problem D: Present Recommendation System is not effective
- Business Objective: Maximize Cross-selling & Up-selling
- Business Constraints: Minimize Coupon Fatigue
Digital Marketing business problem solved using AI.
- Business Problem E: Google Adwords Strategy is not effective
- Business Objective: Maximize Click Through Rate
- Business Constraints: Minimize Cost Per Click
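The pattern shared by these examples (a short objective and a short constraint, each led by an optimization term) can be captured in a small record. The Python sketch below is illustrative only: the names `ProblemStatement` and `is_well_formed`, and the limit of four words per statement, are assumptions made for this example and are not part of CRISP-DM.

```python
from dataclasses import dataclass

# Optimization terms that should lead an objective or a constraint.
OPTIMIZATION_TERMS = ("Maximize", "Minimize")


def is_well_formed(statement: str, max_words: int = 4) -> bool:
    """Return True if the statement starts with an optimization term and is short.

    max_words is an assumed limit (the term plus up to three more words).
    """
    words = statement.split()
    return bool(words) and words[0] in OPTIMIZATION_TERMS and len(words) <= max_words


@dataclass(frozen=True)
class ProblemStatement:
    problem: str
    objective: str
    constraint: str

    def is_valid(self) -> bool:
        """Both the objective and the constraint must follow the short pattern."""
        return is_well_formed(self.objective) and is_well_formed(self.constraint)


# Business Problem A from the list above.
loan_defaults = ProblemStatement(
    problem="Significant proportion of customers who take loans are unable to repay",
    objective="Minimize Loan Defaulters",
    constraint="Maximize Profits",
)
print(loan_defaults.is_valid())
```

A statement such as "Reduce churn" would be rejected by this check because it does not start with an optimization term, which is a cue to rephrase it.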
These examples set the context and tone for the next step.
Stage 1.b. is explained in the section below:
1.b. ‘Assess and Analyze Scenarios’
This step typically deals with the many project management activities required to proceed further with the project.
First, one needs to analyze the As-Is state from the perspective of:
- Data presently available
  - Secondary data sources
  - Size of data available
  - How much data would get generated on a daily basis?
  - Various formats in which data is stored
- Human Resources
  - All the cross functional human resources
  - Experience in Domain, Programming, AI, etc.
- Availability of Human Resources
  - Full time employees
  - Contract employees and their tenure
  - Who are the employees serving notice period, etc.
- Risks
  - Political
  - Social
  - Economical
  - Technological
Second, one has to analyze what is required in terms of:
- Hardware & Software
  - Configuration of computers, servers
  - Is data stored on cloud or on premise
  - Streaming vs Batch processing of data
- Human Resources
  - Chief Data Scientists, Data Security Experts, etc.
  - Data Engineers, Data Analysts, Data Scientists, etc.
  - Web Application, Mobile Application, UI, UX developers for deployment, etc.
- Record Assumptions & Constraints of each requirement
  - All assumptions
  - All constraints with respect to
    - Time, Cost, Scope, Resources, Risk, Quality
  - Verify these assumptions & constraints in light of data available
Next, perform risk management for:
- Timelines
- Human Resources
- Data
- Hardware
- Software
- Financial Aspects
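One common way to organize risk management across these areas is a risk register ranked by exposure. The sketch below assumes a simple scoring scheme (exposure = probability × impact) that the text above does not prescribe; the `Risk` class, the `prioritize` helper, and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Risk:
    area: str           # one of the areas listed above, e.g. "Timelines"
    description: str
    probability: float  # estimated likelihood, 0.0 to 1.0
    impact: int         # estimated severity, 1 (minor) to 5 (severe)
    response: str       # planned risk response

    @property
    def exposure(self) -> float:
        # Assumed scoring scheme: probability multiplied by impact.
        return self.probability * self.impact


def prioritize(risks: List[Risk]) -> List[Risk]:
    """Return the risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Hypothetical entries, for illustration only.
register = [
    Risk("Hardware", "Server capacity is insufficient", 0.2, 5, "Reserve cloud capacity"),
    Risk("Data", "Historical records are incomplete", 0.6, 4, "Identify secondary sources"),
    Risk("Human Resources", "Key analyst is serving notice", 0.5, 3, "Cross-train a backup"),
]

for risk in prioritize(register):
    print(f"{risk.area}: exposure {risk.exposure:.1f} -> {risk.response}")
```

Ranking by exposure keeps attention on the few risks that most threaten the project, and the response field keeps each risk tied to an action.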
Finally, documenting and defining success criteria along with ROI is important for measuring project success. This ensures that every stakeholder is aware of what constitutes success. Success criteria can be tangible as well as intangible. However, to remove room for ambiguity, one should define success criteria that are SMART (Specific, Measurable, Achievable, Relevant, Time-Bound). In certain complex projects, instead of a simple ROI, people use NPV (Net Present Value), which expresses future cash inflows in present terms, net of the investment made.
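To make the difference between ROI and NPV concrete, here is a minimal Python sketch. The function names, the 10% discount rate, and the cash flow figures are assumptions chosen for illustration; a real project would take them from its own financial plan.

```python
from typing import Sequence


def roi(total_benefit: float, total_cost: float) -> float:
    """Simple ROI: net gain divided by cost, ignoring when the money arrives."""
    return (total_benefit - total_cost) / total_cost


def npv(rate: float, initial_investment: float, cash_flows: Sequence[float]) -> float:
    """Discount each future cash flow to present terms and subtract the upfront cost.

    cash_flows[0] is assumed to arrive at the end of year 1, cash_flows[1]
    at the end of year 2, and so on.
    """
    present_value = sum(
        cash_flow / (1 + rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )
    return present_value - initial_investment


# Hypothetical project: costs 100 upfront, returns 60 at the end of each of two years.
print(f"ROI: {roi(120, 100):.2f}")             # ROI: 0.20
print(f"NPV: {npv(0.10, 100, [60, 60]):.2f}")  # NPV: 4.13
```

In this example the simple ROI of 20% looks comfortable, while the NPV at a 10% discount rate is only about 4.13, because money received in later years is worth less in present terms.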
Now let us move on to the next step.
Stage 1.c. is explained in the section below:
1.c. Define Data Mining Problem
It is always necessary to clearly derive the Data Mining problem from the Business problem. While we touched on this as part of 1.a., we now formally document the Data Mining problem derived from the Business problem.
Here are the points that should be considered:
- Pre-analysis phase
- Input to this will be Success criteria and business problem along with risks, assumptions & constraints
- Technical discussions with Data Scientists, Data Analysts, Data Engineers, Architects, etc.
- Understand which ML and Data Mining techniques and algorithms are suitable for the business problem to be solved
- High level design for end to end solution architecture along with integration into existing customer infrastructure
- Success criteria from a Data Science perspective, e.g. no overfitting with accuracy of > 75%. The acceptable threshold depends on the industry, e.g. social sciences versus medical sciences.
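A Data Science success criterion such as "no overfitting with accuracy of > 75%" can be written down as an explicit check. The sketch below is one possible reading, not a standard definition: it treats a large gap between training and test accuracy as a sign of overfitting, and both the function name and the 0.05 gap tolerance are assumptions.

```python
def meets_success_criteria(
    train_accuracy: float,
    test_accuracy: float,
    min_accuracy: float = 0.75,  # threshold from the example above
    max_gap: float = 0.05,       # assumed tolerance for train/test gap
) -> bool:
    """Check accuracy on unseen data and the train/test gap together."""
    accurate_enough = test_accuracy > min_accuracy
    not_overfitting = (train_accuracy - test_accuracy) <= max_gap
    return accurate_enough and not_overfitting


print(meets_success_criteria(0.82, 0.79))  # True: accurate and a small gap
print(meets_success_criteria(0.99, 0.78))  # False: the gap suggests overfitting
print(meets_success_criteria(0.74, 0.73))  # False: below the accuracy threshold
```

Both thresholds would be agreed with stakeholders during this phase, since an accuracy that is acceptable in one industry may be far too low in another.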
Finally, we arrive at the final sub-module of step 1 of CRISP-DM.
Stage 1.d. is explained in the section below:
1.d. Project Plan
While a project plan may contain multiple components, our focus should be on the following key components:
- High Level Timelines
- Allocated Human resources
- Allocated Hardware and Software
- Risks and Risk Response plans
- High Level Deliverables along with Success Criteria for each of 6 phases of CRISP-DM
- Highlight One-time activities and Iterative activities pictorially
Now we should formally close Phase I of the project. The end of a phase is typically called a Gate end, Kill Gate, Phase end, etc.
Formal Closure of Stage 1 of CRISP-DM
Phase Gate Check Points include the following:
- Project Charter
- Definition of Business Objectives
- Success Criteria for Business as well as Data Mining
- Cost Allocation and Resource Planning (Hardware as well as Software)
- ML and DM techniques and algorithms to be applied including workflow of Data from exploration to deployment
- Project plan for all 6 phases of CRISP-DM with timelines and risks identified at each phase
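The check points above lend themselves to a simple checklist that blocks phase closure until every item is in place. This is only a sketch: the shortened item labels and the helper names `missing_checkpoints` and `gate_passed` are assumptions made for this example.

```python
from typing import List, Set

# Shortened labels for the phase gate check points listed above.
GATE_CHECKPOINTS = [
    "Project Charter",
    "Business Objectives",
    "Success Criteria",
    "Cost Allocation and Resource Planning",
    "ML and DM Techniques and Data Workflow",
    "Project Plan for All 6 Phases",
]


def missing_checkpoints(completed: Set[str]) -> List[str]:
    """Return the check points not yet completed, in gate order."""
    return [item for item in GATE_CHECKPOINTS if item not in completed]


def gate_passed(completed: Set[str]) -> bool:
    """The phase can be closed only when nothing is missing."""
    return not missing_checkpoints(completed)


done = {"Project Charter", "Business Objectives", "Success Criteria"}
print(gate_passed(done))
print(missing_checkpoints(done))
```

Listing what is missing, rather than returning only pass or fail, gives the project team a concrete to-do list before the gate review.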
With this we conclude Phase I, or Step 1, of CRISP-DM. In the next module we shall discuss the Data Understanding phase. Do leave your comments for the betterment of subsequent modules. I hope you enjoyed the journey of learning about project management of data-related projects.