Tips Specific to Data Analysis for Cultural Orgs

From the ever-lovely Colleen Dilenschneider:

https://www.colleendilen.com/2018/01/03/three-common-misunderstandings-data-held-cultural-executives/

 

https://www.colleendilen.com/2018/01/10/mets-admission-price-will-not-hurt-accessibility-may-help-data/

List of Data Analysis Techniques

*13. For Your Projects, What Are The Primary Methods You Or Your Company Utilizes (Data science and related projects)

A/B Testing
Anomaly Detection
Attribution Modeling
Classification
Clustering
Cross-Validation
Decision Trees
Deep Learning
Ensemble Methods
Game Theory
KNN (Nearest Neighbor)
Machine Vision
Model Fitting
Monte-Carlo Simulation
Neural Networks
NLP
Pattern Recognition
PCA or Dim. Reduction
Recommendation Systems
Regression
Supervised Learning
Support Vector Machines
Survival Analysis
Time Series
Unsupervised Learning
Voice Recognition

Machine Learning Aspirations

Machine learning:

Train your algorithm: input all the stats you have on donors/ticket buyers. The more data it has, the more accurately the algorithm can predict whether someone will donate, and how much, based on the traits they share with the high performers already put through the algorithm.
For example: zip = x, age = y, education level = z
The algorithm is
2.5x + .05y + .11z = amount donated
where the numbers (the weights) are what machine learning figures out over time as you "feed" it the answers already in your db. Zip code and education level would first need to be encoded as numbers.
Person 1 has x = '90210', y = 35, and z = 'grad school', and has donated a total of $500 to YBCA.
Machine learning will take that data and, at first, say that everyone who comes through the algorithm with the same stats will give the same amount. But the more data it has, the more complex it will get.
I can see us making algorithms for donations, memberships, ticket purchases, and attendance.
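
To make the "feeding it the answers" idea concrete, here is a minimal sketch in Python. It is simplified to a single numeric trait (age) and fits the weight with ordinary least squares, which is only one of many possible techniques. The donor records are made up for illustration, and the function names `fit_line` and `predict_donation` are hypothetical.

```python
# Hypothetical, made-up donor records: (age, total donated in dollars).
donors = [(25, 50), (35, 120), (45, 180), (55, 260), (65, 310)]


def fit_line(records):
    """Ordinary least squares for one input: donation = slope * age + intercept."""
    n = len(records)
    mean_x = sum(x for x, _ in records) / n
    mean_y = sum(y for _, y in records) / n
    # How age and donation vary together, and how much age varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in records)
    variance = sum((x - mean_x) ** 2 for x, _ in records)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_donation(age, slope, intercept):
    """Apply the learned weight and intercept to a new person."""
    return slope * age + intercept


slope, intercept = fit_line(donors)
print(f"donation = {slope:.2f} * age + {intercept:.2f}")
print(predict_donation(35, slope, intercept))
```

With more traits the same idea applies, but each trait gets its own weight, and non-numeric traits such as zip code have to be turned into numbers first.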

Coursera Data Math Skills

A tangent line is a line that touches a curve at a single point and has the same direction as the curve there.

The slope of this tangent line is the instantaneous rate of change.

Say you have a curve that shows how much revenue comes in (y) at each price point for a product (x). Take a point on that curve, x = a: how fast is f(x) changing at x = a? That is the velocity of the curve at a given point, also called the instantaneous rate of change or the derivative. You can use this to determine, for example, how much your revenue will increase or decrease as you raise the price of your product. The formula is:

f'(a) = lim h->0 [f(a + h) - f(a)] / h
aka the derivative of the function f at x=a
aka the slope
aka by how much the value of f(x) is increasing at the point a
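
A quick numeric check of that limit, as a sketch in Python. The revenue curve here is hypothetical (it assumes 100 units sell at a price of $0 and 2 fewer units sell for every $1 of price), and `derivative_at` is an illustrative helper, not a library function.

```python
def revenue(price):
    # Hypothetical revenue curve: units sold = 100 - 2 * price.
    return price * (100 - 2 * price)


def derivative_at(f, a, h=1e-6):
    """Approximate f'(a) with the difference quotient [f(a + h) - f(a)] / h."""
    return (f(a + h) - f(a)) / h


# Shrinking h moves the estimate toward the true slope at price = 10.
for h in (1.0, 0.1, 0.001):
    print(h, derivative_at(revenue, 10, h))

rate = derivative_at(revenue, 10)
print(rate)
```

For this curve the exact derivative is 100 - 4 * price, so at a price of $10 the estimates approach 60: each extra dollar of price adds roughly $60 of revenue at that point.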

 

Couple Quick Links

https://stackoverflow.com/questions/612231/how-can-i-select-rows-with-maxcolumn-value-distinct-by-another-column-in-sql

–finally the perfect way to filter a SQL select based on the values in another column
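
For reference, one common pattern for this kind of query, sketched with Python's built-in sqlite3 module. The `gifts` table, its columns, and the rows are all made up for illustration; the query joins the table to a grouped subquery to keep each donor's most recent gift.

```python
import sqlite3

# In-memory database with a hypothetical gifts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gifts (donor TEXT, gift_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO gifts VALUES (?, ?, ?)",
    [
        ("Ana", "2017-03-01", 50.0),
        ("Ana", "2018-01-15", 75.0),
        ("Ben", "2016-11-20", 200.0),
        ("Ben", "2017-06-05", 20.0),
    ],
)

# The subquery finds the latest gift_date per donor; the join keeps only
# the full rows that match those (donor, latest date) pairs.
latest_gifts = conn.execute(
    """
    SELECT g.donor, g.gift_date, g.amount
    FROM gifts AS g
    JOIN (
        SELECT donor, MAX(gift_date) AS latest
        FROM gifts
        GROUP BY donor
    ) AS m ON m.donor = g.donor AND m.latest = g.gift_date
    ORDER BY g.donor
    """
).fetchall()

print(latest_gifts)
```

Storing dates as ISO strings (YYYY-MM-DD) is what lets MAX pick the latest one here; if two gifts share the same latest date, both rows come back.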

 

http://www.storytellingwithdata.com/blog/2017/8/2/axis-vs-data-labels

In general, when you are deciding whether to show the axis, label the data directly, or some combination of these things: consider how you want your audience to read the graph. What level of specificity do they need to have with the individual data points? Where do you want them to pay attention? Let the answers to these questions guide your thoughtful design.

Audience Demographics – Intersectionality

My colleague Kevin wrote a great blogpost on getting community feedback on the structure of demographics surveys. How to keep your survey relevant amidst changing trends in how people self-identify:

https://blog.americansforthearts.org/2017/06/13/audience-demographics-the-complexities-of-intersectionality

Bottom line: give people the ability to check multiple boxes and invite them to write in unrepresented responses. Then read those write-in answers and find ways to improve your checkbox selection for next time.
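
On the analysis side, a rough sketch of how multi-select answers and write-ins might be tallied in Python. The option labels, the response layout (`checked`, `write_in`), and the data are all hypothetical placeholders, not taken from any real survey.

```python
from collections import Counter

# Hypothetical checkbox options and made-up responses.
OPTIONS = {"Option A", "Option B", "Option C"}
responses = [
    {"checked": ["Option A", "Option B"], "write_in": ""},
    {"checked": ["Option C"], "write_in": "Option D"},
    {"checked": [], "write_in": "option d "},
    {"checked": ["Option B", "Option C"], "write_in": ""},
]

checkbox_counts = Counter()
write_in_counts = Counter()
for response in responses:
    # One person can add to several options, so totals can exceed
    # the number of respondents.
    checkbox_counts.update(o for o in response["checked"] if o in OPTIONS)
    # Light normalization so near-identical write-ins are counted together.
    text = response["write_in"].strip().lower()
    if text:
        write_in_counts[text] += 1

print(checkbox_counts.most_common())
print(write_in_counts.most_common())
```

Write-ins that keep showing up at the top of the second list are candidates for a new checkbox next time.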

Related reading: https://medium.com/shenomads/respectful-collection-of-demographic-data-56de9fcb80e2