The goal is to quickly check, using machine learning (sklearn’s Random Forests), whether any column in a DataFrame predicts any other column. I’m interested in the question “what relationships exist in my data?” – particularly when I’m working in an unknown domain and on new data. I’ve used this on client projects during the discovery phase to learn more about the sort of questions I should ask a client.
This is a very light project at the moment. I think the idea has value, and I’m very open to feedback.
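To make the idea concrete, here is a minimal sketch of what "does any column predict any other column" could look like. It is an illustration under stated assumptions, not the project's actual code: the function name `pairwise_predictability`, its parameters, and the demo data are all hypothetical, every column is assumed to be numeric, and cross-validated R² (clipped at zero) is assumed as the score.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def pairwise_predictability(df, n_estimators=50, cv=3, random_state=0):
    """Score how well each single column predicts each other column.

    Hypothetical helper: assumes all columns are numeric. Returns a square
    DataFrame where rows are targets, columns are features, and each cell
    is the mean cross-validated R^2, clipped at 0. The diagonal stays NaN.
    """
    cols = list(df.columns)
    scores = pd.DataFrame(np.nan, index=cols, columns=cols)
    for feature in cols:
        for target in cols:
            if feature == target:
                continue
            # Use only the rows where both columns are present.
            pair = df[[feature, target]].dropna()
            model = RandomForestRegressor(
                n_estimators=n_estimators, random_state=random_state
            )
            r2 = cross_val_score(
                model, pair[[feature]], pair[target], cv=cv, scoring="r2"
            )
            # Negative R^2 means "worse than predicting the mean"; treat as 0.
            scores.loc[target, feature] = max(0.0, r2.mean())
    return scores


# Synthetic demo data (made up for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 300)
demo = pd.DataFrame(
    {"x": x, "x_squared": x ** 2, "noise": rng.normal(size=300)}
)
demo_scores = pairwise_predictability(demo)
print(demo_scores.round(2))
```

One thing a sketch like this shows that a correlation matrix does not: the result is asymmetric. In the demo, `x` should predict `x_squared` well, while `x_squared` cannot recover the sign of `x`, so the score in the other direction should be low.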
Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high-energy Springer Spaniel and is a consumer of fine coffees.