The introduction was pleasant; the presenter is funny and well spoken, and my impression is that he knows his stuff and is looking to make it easy for us to learn it. He was very clear about his expectations relating to attendance, assignments, and the final exam, and gave us a lot of ways to contact him. He loses points[1] for using 'gestapo' to describe the exam monitors, printing half a tree of handouts, not using the smart-board (he prefers white-boards and has a vast pen collection) and for being relentlessly blokey.
The text is Statistics for Managers Using Microsoft Excel and you need an Excel plugin called PHStat which I will muck around with this evening. I look forward to dumping a really big dataset in it and clicking 'Go.'
Why do Managers need to know about Sadistics, er, Statistics?
- Presenting information
- Drawing conclusions about information
- Forecasting information
- Using information to improve things (like picking out problem areas of productivity that are outside normal deviation and resolving them)
Terminology:
- A population (universe) is the collection of units under consideration. eg Entire Australian population
- A sample is a portion of the population selected for analysis. eg MBA students in Australia
- A parameter is a summary measure computed to describe a characteristic of the population. eg: Female MBA students in Australia
- A statistic is a summary measure computed to describe a characteristic of the sample eg: How many Female MBA students in Australia graduate within 3 years
- A dataset can be both a population and a sample.
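A minimal Python sketch of the parameter versus statistic distinction, using the usual textbook convention that a parameter summarises the whole population and a statistic summarises a sample. The population of ages, the sample size and the variable names are all made up for illustration, not taken from the lecture.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 10,000 people (invented data).
population = [random.randint(18, 80) for _ in range(10_000)]

# Parameter: a summary measure computed over the entire population.
population_mean = statistics.mean(population)

# Statistic: the same summary measure computed over a sample only.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers should land close together but will rarely match exactly, which is the gap the rest of the course is presumably about.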
Data sources:
- Primary
  - Observation - look at it
  - Experimentation - poke it
  - Survey - ask it questions
- Secondary
  - Print - read it
  - Electronic - read it some more but with less photocopying
- Categorical (qualitative)
  - Nominal (has no logical ranking eg: eye colour)
  - Ordinal (ranked eg: Likert scale)
- Numerical (quantitative)
  - Interval (continuous measure eg: temperature)
  - Ratio (discrete count eg: number of people in a room)
Ways to slice data:
- Time-Series Data: values recorded in a meaningful sequence such as days, quarters or years.
  - Y (forecast variable) = T x S x C x I
  - T = trend (is the response to some Facebook photos linked to age?)
  - S = seasonal (is more fanfiction written on hiatus?)
  - C = cyclical (is porn writing in fandom cyclical?)
  - I = irregular / random (acts of god - or terrorism)
- Cross-Sectional Data: data has no meaningful sequence such as sales figures for multiple companies
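A rough Python sketch of the multiplicative model Y = T x S x C x I, assuming four years of quarterly data. Every component below is invented for illustration (the growth rate, the seasonal indices, the 12-quarter cycle and the noise range are all assumptions); it only shows how the pieces multiply together and how dividing out the known parts leaves the rest.

```python
import math
import random

random.seed(1)
quarters = range(16)  # four hypothetical years of quarterly observations

# Made-up components, purely illustrative.
trend = [100 + 5 * t for t in quarters]                                  # T: steady growth
seasonal = [[0.8, 1.0, 1.3, 0.9][t % 4] for t in quarters]               # S: repeats every 4 quarters
cyclical = [1 + 0.1 * math.sin(2 * math.pi * t / 12) for t in quarters]  # C: slow wave
irregular = [random.uniform(0.97, 1.03) for _ in quarters]               # I: random noise

# Multiplicative model: Y = T x S x C x I
y = [t * s * c * i for t, s, c, i in zip(trend, seasonal, cyclical, irregular)]

# Dividing out trend and seasonality leaves the cyclical and irregular parts (C x I).
remainder = [yt / (t * s) for yt, t, s in zip(y, trend, seasonal)]

for q in range(4):
    print(f"quarter {q}: Y = {y[q]:.1f}, C x I = {remainder[q]:.3f}")
```

In real data the components are not known in advance and have to be estimated, so treat this as the model run forwards rather than a forecasting method.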
Sampling Methods:
- Probability Samples
  - Simple Random (lotto!) - simple to use but may not be a good representation
  - Systematic (grab every 5th person)
  - Stratified (select representatives based on some significant quality of the population eg: gender, nationality, location) - may be time consuming and costly
  - Cluster (select clusters based on their representing larger population) - may be more cost-effective, but less efficient
- Non-Probability Samples
  - Judgment
  - Quota (must ask 1,000 people!)
  - Chunk
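The four probability sampling methods can be sketched in Python roughly as follows. The population of 100 people, the "campus" attribute and the function names are all hypothetical, and the cluster version simply treats consecutive blocks of the list as clusters, which is an assumption made to keep the example short.

```python
import random

random.seed(7)

# Hypothetical population: 100 people, each tagged with a made-up campus.
population = [{"id": i, "campus": "city" if i % 4 == 0 else "suburban"}
              for i in range(100)]

def simple_random(pop, n):
    """Lotto: every unit has the same chance of selection."""
    return random.sample(pop, n)

def systematic(pop, k):
    """Pick a random starting point, then take every k-th unit."""
    start = random.randrange(k)
    return pop[start::k]

def stratified(pop, key, fraction):
    """Split into strata by `key`, then sample the same fraction from each."""
    strata = {}
    for unit in pop:
        strata.setdefault(unit[key], []).append(unit)
    chosen = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))
        chosen.extend(random.sample(members, n))
    return chosen

def cluster(pop, cluster_size, n_clusters):
    """Divide into clusters, pick whole clusters at random, keep everyone in them."""
    clusters = [pop[i:i + cluster_size] for i in range(0, len(pop), cluster_size)]
    picked = random.sample(clusters, n_clusters)
    return [unit for group in picked for unit in group]

print(len(simple_random(population, 10)))
print(len(systematic(population, 5)))
print(len(stratified(population, "campus", 0.2)))
print(len(cluster(population, 10, 2)))
```

Stratified sampling keeps each group's share of the sample in line with its share of the population, while cluster sampling trades that guarantee for only having to visit a couple of groups.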
Stuff to read for next week: 2.4 - 2.7 & 3.1 - 3.4
Then we ran away.
[1] Points will be added/subtracted continuously; the current score is +3. The purpose of my points system is to measure first impressions versus final impressions, so stay tuned for the 12 week update :p.
no subject
Shocked.
And possibly stunned.
no subject
*offers smelling salts*
quibbling
- categorical is not qualitative necessarily, especially if you are talking Likert scales
- ratio - interval data with a meaningful zero, so that the term 'half as many' has real meaning (ie Celsius is not ratio, as 10 degrees isn't half as hot as 20 degrees).
- trend - is there a consistent change over time (as facebook users age, do they put up more photos per month)
- cross sectional data - no meaningful *time based* sequence.
as for data sets - I've been collecting fuelwatch data (as emailed to me daily) for a few years - that would have interesting weekly cycles, possible seasonals, and long-term trends (and deviations that can be linked to world wide events...) if you are in need of something.
also - Excel? eeeewwww. @#$@#% unreliable piece of @@@@@, addin or no addin.
(let me show you this loooovely open source data manipulation program I have over here)
Re: quibbling
Thank you for quibbling at me!
- My interpretation of parameter is that it could be gender/age/nationality/income/distance from the beach etc. I have probably expressed it badly by putting an example of a gender rather than saying gender - or have I got the wrong end of this stick?
- categorical is not qualitative necessarily... Likert scales measure the level to which you agree with something don't they? I notice wikipedia says they are regarded as ordered-categorical or interval-level which is interesting. Lecturer presented them as ordered-categorical.
- ratio - interval data with a meaningful zero *sweats* that's a lot more complex than what I was given - so a ratio is a continuous scale for which zero is the lowest value?
- trends I was thinking of responses per month to different age groups - does that apply?
Datasets - I'm trying to think how to wodge the output from Supernatural_fic in to get something meaningful out... You can totally show me loooovely open source data manipulation programs ;)
Re: quibbling
for the last one, hmm. Because it is *specifically* about time series data, the term trend refers to the long term change over time. Eg. the price of oil has been trending up over the last 50 years, even though there have been short term drops.
Trend gets used in other contexts as well - age related trends are certainly talked about. Also, when a pattern is seen in the data, but it isn't quite extreme enough to be statistically significant, this is sometimes called a trend in the data.
re supernatural fic output wodging - you are welcome to run it past me to see if you have workable ideas.
as to open source data manipulation programs - see http://cran.r-project.org/
can be run under most systems. Has a learning curve that feels like running into a brick wall repeatedly and then running straight through said brick wall, and into freedom, but for anyone who has any real coding experience, it should be very quick.
When are you on campus this trimester?
Re: quibbling
I'm at uni 6-9 on Tue/Fri and can be there earlier if there is the possibility of company :)