


General Information

Phone: +1 819-727-9977



Likes: 1617




Facebook Blog

DigiHeights 03.05.2021

Common Mistakes by Machine Learning Learners:

1) Adjusted R² works very well in multiple linear regression when you have unwanted variables. (Hint: adjusted R² can't be used in most scenarios.)
2) One-hot encoding is the way to go to convert categorical variables into numerical ones. (Hint: imagine columns like Country, State, and City. You will end up creating 1000 columns or more to represent 3 columns.)
3) Should I do feature engineering first or the train-test split first? (Hint: have you heard of data leakage?)
4) I ran PCA for dimensionality reduction, but I got poorer results than without it. (Hint: are you running it on the entire dataset? Have you checked the VIF score first?)
5) A machine learning model is the way to go for solving Data Science problems. (Hint: 90% of the problems don't require a machine learning model.)
6) R² ranges between 0 and 1. (Hint: it can be negative as per the formula.)
7) Outliers can be detected using univariate boxplots. (Hint: think about contextual outliers and leverage points.)
8) K-Means and hierarchical clustering are the only clustering techniques. (Hint: they are not. In fact, both have many problems and are rarely used on real-life problems.)
9) If a column contains only unique values, we should drop it. (Hint: it may hold very useful information and should not be dropped straight away.)
10) We have missing values, so let's replace them with the mean, median, or mode. (Hint: it's better to drop them rather than impute in this case.)

Please feel free to WhatsApp +91-8197279977 for the solutions, for other Data Science tips, or if you wish to know how to make a career transition into Data Science / Artificial Intelligence.
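The hint in point 6 can be checked by hand. The sketch below is not from the original post: it assumes the usual definition R² = 1 - SS_res / SS_tot, and the function name and toy numbers are made up for illustration. It suggests that a model predicting worse than a constant mean baseline ends up with a negative R².

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


# Illustrative toy data (hypothetical, not from the post).
y_true = [1.0, 2.0, 3.0, 4.0]
mean_baseline = [2.5, 2.5, 2.5, 2.5]   # always predicts the mean of y_true
reversed_model = [4.0, 3.0, 2.0, 1.0]  # predicts the trend backwards

r2_perfect = r_squared(y_true, y_true)
r2_baseline = r_squared(y_true, mean_baseline)
r2_reversed = r_squared(y_true, reversed_model)

print(r2_perfect, r2_baseline, r2_reversed)
```

With these numbers the perfect predictions score 1, the mean baseline scores 0, and the reversed model scores below 0, so R² is bounded above by 1 but not below by 0.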