Colin No Comment data science, machine learning, Python
Net promoter score (NPS) is a tool companies often use to gauge the loyalty of their customers and how those customers rate the customer service. When NPS surveys are conducted, it is common to also ask customers for written feedback in the form of a short comment. I was recently reviewing a
Colin No Comment data science, IPython, Python
Sometimes in Jupyter notebooks it's nice to be able to hide the code cells so all you're left with is the results. A simple way to do this is to paste this code into an input cell and run it. It will hide all the input cells and put a "Show code" button in this
Colin No Comment data science, matplotlib
How do you rotate the x-axis tick labels on subplots in matplotlib?
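The excerpt does not show the post's own answer, so the following is only a minimal sketch of one way to do it: call `tick_params(axis="x", labelrotation=...)` on each subplot's `Axes`. The month labels, values, figure size, and 45 degree angle are made-up illustration values, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, assumed here so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only.
labels = ["January", "February", "March", "April"]
values = [3, 7, 5, 6]

fig, axes = plt.subplots(1, 2, figsize=(8, 3))

for ax in axes:
    ax.bar(labels, values)
    # Rotate only the x-axis tick labels of this subplot.
    ax.tick_params(axis="x", labelrotation=45)

# Leave room for the rotated labels.
fig.tight_layout()
```

Setting the rotation per `Axes` keeps each subplot independent, which is why the loop is used rather than a single figure-level call.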
Colin No Comment data science
Copied from: https://www.quora.com/What-is-a-good-explanation-of-Latent-Dirichlet-Allocation by Edwin Chen, Data Science Professor at the University of Buenos Aires (UBA). Suppose you have the following set of sentences: I ate a banana and spinach smoothie for breakfast. I like to eat broccoli and bananas. Chinchillas and kittens are cute. My sister adopted a kitten yesterday. Look at this cute hamster munching on a
Colin No Comment data science, IPython, Python
Here are some examples of frequency tables in Python using the SAS buytest data set. One-way tables: Count based. Which produces data like this. Frequency: We can calculate a frequency distribution by dividing by the sum of the values column. Which produces data like this. Two-way tables: Count based. For two way,
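The code from the original post is not included in this excerpt. As a rough sketch of the same three ideas (one-way counts, a one-way frequency distribution, and two-way counts), the following uses pandas `value_counts` and `crosstab`. The toy data and the column names `sex` and `married` are hypothetical and are not taken from the SAS buytest data set.

```python
import pandas as pd

# Hypothetical toy data; NOT the SAS buytest data set.
df = pd.DataFrame({
    "sex": ["F", "M", "F", "F", "M", "M"],
    "married": ["Y", "N", "N", "Y", "Y", "N"],
})

# One-way table, count based: number of rows per category.
counts = df["sex"].value_counts()

# One-way table as a frequency distribution: divide each count by the total.
freq = counts / counts.sum()

# Two-way table, count based: rows are "sex", columns are "married".
two_way = pd.crosstab(df["sex"], df["married"])

print(counts)
print(freq)
print(two_way)
```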
Colin No Comment IPython, Python
I recently wanted to plot a facet grid plot of all the columns in a Pandas dataframe. This functionality is very useful in other statistical packages (e.g. SAS and R) for initial data exploration, and it can be done in Pandas using the hist() method if all the columns are numerical, but this ignores
Colin No Comment data science
From Ido Green's blog: There is a Yahoo API for getting financial data. Just send a request such as http://finance.yahoo.com/d/quotes.csv?s=GE+PTR+MSFT&f=snd1l1yr
The codes are:
a Ask
a2 Average Daily Volume
a5 Ask Size
b Bid
b2 Ask (Real-time)
b3 Bid (Real-time)
b4 Book Value
b6 Bid Size
c Change & Percent Change
c1 Change
c3 Commission
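The request URL has two parts: `s` carries the ticker symbols joined with `+`, and `f` carries the field codes concatenated into one string. A minimal sketch of assembling such a URL is below. The function name `build_quote_url` is hypothetical, the sketch only builds the string and does not send a request, and whether the endpoint still responds is not something the excerpt states.

```python
BASE_URL = "http://finance.yahoo.com/d/quotes.csv"  # endpoint quoted in the post


def build_quote_url(symbols, field_codes):
    """Return a quotes.csv URL for the given symbols and field-code string.

    Hypothetical helper: it only formats the URL in the shape shown in
    the post and performs no network access.
    """
    if not symbols:
        raise ValueError("at least one ticker symbol is required")
    # Symbols are joined with '+', as in the example request.
    symbol_part = "+".join(symbols)
    return f"{BASE_URL}?s={symbol_part}&f={field_codes}"


# Rebuild the example request from the post.
url = build_quote_url(["GE", "PTR", "MSFT"], "snd1l1yr")
print(url)
```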
Colin No Comment Uncategorized
Slides from http://www.cs.ukzn.ac.za/~hughm/dm/content/slides07.pdf Assumptions: To introduce classifiers we will assume that x is a vector of real-valued numerical input features. There are p of them. The variable we are trying to learn is called the response, Y. Initially we will assume that Y is a binary class and it will simplify the bookkeeping to say
Colin No Comment data science
Best explanation of Mahalanobis distance I've ever seen. Courtesy of whuber at stats.stackexchange. Here is a scatterplot of some multivariate data (in two dimensions): What can we make of it when the axes are left out? Introduce coordinates that are suggested by the data themselves. The origin will be at the centroid of the points (the