Business Intelligence (Introduction to Data Analysis)

Published in: Big Data & Hadoop
4,027 Views

What is data analysis and why do we require it?

Mishal A / Noida

33 years of teaching experience

Qualification: B.Tech/B.E. (BIT MESRA - 2014)

Teaches: All Subjects, Chemistry, Computer Science, Mathematics, Physics

  1. DATA ANALYSIS, INTERPRETATION AND PRESENTATION
  2. OVERVIEW
     - Qualitative and quantitative
     - Simple quantitative analysis
     - Simple qualitative analysis
     - Tools to support data analysis
     - Theoretical frameworks: grounded theory, distributed cognition, activity theory
     - Presenting the findings: rigorous notations, stories, summaries
  3. WHY DO WE ANALYZE DATA
     The purpose of analysing data is to obtain usable and useful information. The analysis, irrespective of whether the data is qualitative or quantitative, may:
     - describe and summarise the data
     - identify relationships between variables
     - compare variables
     - identify the difference between variables
     - forecast outcomes
  4. Blind men and an elephant (Indian fable). Things aren't always what we think! Six blind men go to observe an elephant. One feels the side and thinks the elephant is like a wall. One feels the tusk and thinks the elephant is like a spear. One touches the squirming trunk and thinks the elephant is like a snake. One feels the knee and thinks the elephant is like a tree. One touches the ear and thinks the elephant is like a fan. One grasps the tail and thinks it is like a rope. They argue long and loud, and though each was partly in the right, all were in the wrong.
  5. SCALES OF MEASUREMENT
     Many people are confused about what type of analysis to use on a set of data and the relevant forms of pictorial presentation or data display. The decision is based on the scale of measurement of the data. These scales are nominal, ordinal and numerical.
     A nominal scale is where:
     - the data can be classified into non-numerical or named categories, and
     - the order in which these categories can be written or asked is arbitrary.
     An ordinal scale is where:
     - the data can be classified into non-numerical or named categories, and
     - an inherent order exists among the response categories.
     Ordinal scales are seen in questions that call for ratings of quality (for example, very good, good, fair, poor, very poor) and agreement (for example, strongly agree, agree, disagree, strongly disagree).
     A numerical scale is where:
     - numbers represent the possible response categories
     - there is a natural ranking of the categories
     - zero on the scale has meaning
     - there is a quantifiable difference within categories and between consecutive categories.
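The link between scale of measurement and choice of analysis can be sketched in a few lines of Python. This is only an illustration: the mapping `SUMMARIES_BY_SCALE`, the function `summarise` and the sample ratings are hypothetical names and data, not part of the original slides. It assumes the common convention that the mode suits any scale, the median needs at least an ordering, and the mean needs numerical data.

```python
from statistics import mean, median, median_low, mode

# Assumed convention: which summaries are meaningful for each scale.
SUMMARIES_BY_SCALE = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "numerical": ["mode", "median", "mean"],
}


def summarise(values, scale, order=None):
    """Return only the summaries that make sense for the given scale.

    For ordinal data, `order` lists the categories from lowest to highest.
    """
    allowed = SUMMARIES_BY_SCALE[scale]
    result = {"mode": mode(values)}
    if "median" in allowed:
        if scale == "ordinal":
            # Rank each answer, then take the lower middle rank so the
            # result is always one of the named categories.
            ranks = [order.index(v) for v in values]
            result["median"] = order[median_low(ranks)]
        else:
            result["median"] = median(values)
    if "mean" in allowed:
        result["mean"] = mean(values)
    return result


quality_order = ["very poor", "poor", "fair", "good", "very good"]
ratings = ["good", "fair", "very good", "good", "poor"]
ordinal_summary = summarise(ratings, "ordinal", quality_order)
nominal_summary = summarise(["red", "blue", "red"], "nominal")
numerical_summary = summarise([2, 4, 4, 6], "numerical")
```

Calling `summarise` on nominal data returns the mode only, which reflects the point that averaging named categories has no meaning.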
  6. Common myths
     - "Complex analysis and big words impress people." Most people appreciate practical and understandable analyses.
     - "Analysis comes at the end after all the data are collected." We think about analysis upfront so that we HAVE the data we WANT to analyze.
     - "Quantitative analysis is the most accurate type of data analysis." Some think numbers are more accurate than words, but it is the quality of the analysis process that matters.
     When using a quantitative methodology, you are normally testing theory through the testing of a hypothesis. In qualitative research, you are either exploring the application of a theory or model in a different context or are hoping for a theory or a model to emerge from the data. In other words, although you may have some ideas about your topic, you are also looking for ideas, concepts and attitudes, often from experts or practitioners in the field.
  7. Common myths cont...
     - "Data have their own meaning." Data must be interpreted; numbers do not speak for themselves.
     - "Stating limitations to the analysis weakens the evaluation." All analyses have weaknesses; it is more honest and responsible to acknowledge them.
     - "Computer analysis is always easier and better." It depends upon the size of the data set and personal competencies. For small sets of information, hand tabulation may be more efficient.
  8. 1. Organizing the data
     - Organize all forms/questionnaires in one place
     - Check for completeness and accuracy
     - Remove those that are incomplete or do not make sense; keep a record of your decisions
     - Assign a unique identifier to each form/questionnaire
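The organizing steps above can be sketched as a small Python routine. Everything here is hypothetical and for illustration only: the field names in `REQUIRED_FIELDS`, the `Q001`-style identifiers and the sample forms are assumptions, and a real questionnaire would need its own completeness rules.

```python
# Hypothetical questionnaire fields that every form must have filled in.
REQUIRED_FIELDS = ("age", "rating")


def organise(forms):
    """Split raw forms into kept records (with IDs) and a log of removals."""
    kept, decisions = [], []
    for position, form in enumerate(forms, start=1):
        # A field counts as missing if it is absent, None, or an empty string.
        missing = [f for f in REQUIRED_FIELDS if form.get(f) in (None, "")]
        if missing:
            # Keep a record of the decision instead of silently dropping it.
            decisions.append(
                f"form {position}: removed, missing {', '.join(missing)}"
            )
            continue
        # Assign a unique identifier to each form that is kept.
        kept.append({"id": f"Q{len(kept) + 1:03d}", **form})
    return kept, decisions


raw_forms = [
    {"age": 14, "rating": 5},
    {"age": None, "rating": 4},
    {"age": 16, "rating": ""},
    {"age": 15, "rating": 4},
]
kept, decisions = organise(raw_forms)
```

The `decisions` list plays the role of the record of removal decisions, so the clean-up can be reviewed later.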
  9. Enter your data
     - By hand
     - By computer
       - Excel (spreadsheet)
       - Microsoft Access (database management)
       - Quantitative analysis: SPSS (statistical software)
     - Count (frequencies), percentage, mean, mode, median, range, standard deviation, variance, ranking, cross tabulation
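As a rough illustration of the calculations listed on this slide, the following Python sketch computes most of them with the standard library. The ratings in `scores` are made-up data and `describe` is a hypothetical helper. It also assumes the data are the whole population (`pstdev`, `pvariance`); for a sample, `statistics.stdev` and `statistics.variance` would be the usual choice.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev, pvariance

scores = [4, 5, 3, 4, 4, 2, 5, 4]  # made-up ratings on a 1-5 scale


def describe(values):
    """Compute the simple quantitative summaries named on the slide."""
    counts = Counter(values)
    n = len(values)
    return {
        "count": dict(counts),  # frequency of each answer
        "percentage": {v: 100 * c / n for v, c in counts.items()},
        "mean": mean(values),
        "mode": mode(values),
        "median": median(values),
        "range": max(values) - min(values),
        "std_dev": pstdev(values),  # population standard deviation
        "variance": pvariance(values),  # population variance
        # Answers ranked from most to least frequently chosen.
        "ranking": [v for v, _ in counts.most_common()],
    }


summary = describe(scores)
```

Cross tabulation needs two variables per respondent, so it is not covered by this single-variable sketch.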
  10. Which calculation do I use? It depends upon what you want to know.
      - Frequency: Do you want to know how many individuals checked each answer?
      - Percentage: Do you want the proportion of people who answered in a certain way?
      - Mean: Do you want the average number or average score?
      - Median: Do you want the middle value in a range of values or scores?
      - Range: Do you want to show the range in answers or scores?
      - Cross tab: Do you want to compare one group to another?
      - Change score: Do you want to report changes from pre to post?
      - Standard deviation: Do you want to show the degree to which a response varies from the mean?
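Two of these calculations, the cross tab and the change score, can be sketched in Python as follows. The helper names `cross_tab` and `change_scores`, the keys `group` and `answer`, and all of the data are hypothetical; the sketch assumes a change score is simply the post value minus the pre value for each participant.

```python
from collections import Counter


def cross_tab(rows, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    return Counter((r[row_key], r[col_key]) for r in rows)


def change_scores(pre, post):
    """Post minus pre, for each participant ID present in both sets."""
    return {pid: post[pid] - pre[pid] for pid in pre if pid in post}


# Made-up survey responses for comparing one group to another.
responses = [
    {"group": "A", "answer": "yes"},
    {"group": "A", "answer": "no"},
    {"group": "B", "answer": "yes"},
    {"group": "A", "answer": "yes"},
]
table = cross_tab(responses, "group", "answer")

# Made-up pre/post scores; participant p3 has no post score and is skipped.
changes = change_scores({"p1": 2, "p2": 3, "p3": 4}, {"p1": 4, "p2": 3})
```

Because `table` is a `Counter`, a combination that never occurs, such as `("B", "no")`, reads as zero instead of raising an error.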
  11. 3. Interpreting the information
      Numbers do not speak for themselves. For example, what does it mean that 55 youth reported a change in behavior? Or that 25% of participants rated the program a 5 and 75% rated it a 4? What do these numbers mean? Interpretation is the process of attaching meaning to the data.
  12. Interpretation demands fair and careful judgments. Often the same data can be interpreted in different ways. So, it is helpful to involve others or take time to hear how different people interpret the same information. Think of ways you might do this: for example, hold a meeting with key stakeholders to discuss the data, or ask individual participants what they think.
  13. Part of interpreting information is identifying the lessons learned.
      - What did you learn about the program, about the participants, about the evaluation?
      - Are there any 'ah-has'? What is new? What was expected?
      - Were there findings that surprised you?
      - Are there things you don't understand very well, where further study is needed?
      We often include recommendations or an action plan. This helps ensure that the results are used.
  14. 4. Discuss limitations
      - Written reports: be explicit about your limitations
      - Oral reports: be prepared to discuss limitations
      - Be honest about limitations
      - Know the claims you cannot make
        - Do not claim causation without a true experimental design
        - Do not generalize to the population without random sample and quality administration (e.g.
  15. GRAPHICAL REPRESENTATIONS give overview of data
      [Figure: charts of number of errors made per user, grouped by Internet use (less than once a day, once a day, once a week, 2 or 3 times a week, once a month)]
  16. [Figures: interaction profiles of players in online game; log of web page activity]
  17. QUALITATIVE ANALYSIS
      'Data analysis is the process of bringing order, structure and meaning to the mass of collected data. It is a messy, ambiguous, time-consuming, creative, and fascinating process. It does not proceed in a linear fashion; it is not neat. Qualitative data analysis is a search for general statements about relationships among categories of data.' (Marshall and Rossman, 1990: 111)
      Hitchcock and Hughes take this one step further: 'the ways in which the researcher moves from a description of what is the case to an explanation of why what is the case is the case.' (Hitchcock and Hughes, 1995: 295)
  18. - Unstructured: not directed by a script. Rich but not replicable.
      - Structured: tightly scripted, often like a questionnaire. Replicable but may lack richness.
      - Semi-structured: guided by a script, but interesting issues can be explored in more depth. Can provide a good balance between richness and replicability.
  19. - Recurring patterns or themes: emergent from data, dependent on observation framework if used
      - Categorizing data: categorization scheme may be emergent or pre-specified
      - Looking for critical incidents: helps to focus in on key events
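One way to look for recurring themes, once data have been categorized, is to count how many different participants raised each theme. The Python sketch below assumes that interview segments have already been tagged by hand; the participant labels, the theme names and the `recurring_themes` helper are all hypothetical.

```python
# Hypothetical coded excerpts: each segment is tagged with one or more themes.
coded_segments = [
    {"participant": "P1", "themes": ["navigation", "trust"]},
    {"participant": "P2", "themes": ["navigation"]},
    {"participant": "P3", "themes": ["trust", "cost"]},
    {"participant": "P1", "themes": ["navigation"]},
]


def recurring_themes(segments, min_participants=2):
    """Return themes raised by at least `min_participants` distinct people."""
    who = {}
    for segment in segments:
        for theme in segment["themes"]:
            # A set means a participant repeating a theme is counted once.
            who.setdefault(theme, set()).add(segment["participant"])
    return sorted(
        theme for theme, people in who.items() if len(people) >= min_participants
    )


themes = recurring_themes(coded_segments)
```

Counting distinct participants, not raw mentions, is a design choice made here so that one talkative participant does not make a theme look widespread.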
  20. TOOLS TO SUPPORT DATA ANALYSIS
      - Spreadsheet: simple to use, basic graphs
      - Statistical packages, e.g. SPSS
      - Qualitative data analysis tools
        - Categorization and theme-based analysis, e.g. N6
        - Quantitative analysis of text-based data
      - CAQDAS Networking Project, based at the University of Surrey (http://caqdas.soc.surrey.ac.uk/)
  21. Basing data analysis around theoretical frameworks provides further insight. Three such frameworks are:
      - Grounded Theory
      - Distributed Cognition
      - Activity Theory
  22. - Aims to derive theory from systematic analysis of data
      - Based on categorization approach (called here 'coding')
      - Three levels of 'coding'
        - Open: identify categories
        - Axial: flesh out and link to subcategories
        - Selective: form theoretical scheme
      - Researchers are encouraged to draw on own theoretical backgrounds to inform analysis
  23. - The people, environment & artefacts are regarded as one cognitive system
      - Used for analyzing collaborative work
      - Focuses on information propagation & transformation
      - Shared systems
  24. - Explains human behavior in terms of our practical activity with the world
      - Provides a framework that focuses analysis around the concept of an 'activity' and helps to identify tensions between the different elements of the system
      - Two key models: one outlines what constitutes an 'activity'; one models the mediating role of artifacts
  25. [Diagram: three levels, each paired with what drives it. Activity: Motive. Action: Goal. Operation: Conditions.]
  26. [Diagram: activity system with the elements Subject, Tool, Object, Rules, Community and Division of labour; a transformation process leads to the Outcome.]
  27. - Only make claims that your data can support
      - The best way to present your findings depends on the audience, the purpose, and the data gathering and analysis undertaken
      - Graphical representations (as discussed above) may be appropriate for presentation
      - Other techniques are:
        - Rigorous notations, e.g. UML
        - Using stories, e.g. to create scenarios
        - Summarizing the findings
  28. SUMMARY
      - The data analysis that can be done depends on the data gathering that was done
      - Qualitative and quantitative data may be gathered from any of the three main data gathering approaches
      - Percentages and averages are commonly used in Interaction Design
      - Mean, median and mode are different kinds of 'average' and can have very different answers for the same set of data
      - Grounded Theory, Distributed Cognition and Activity Theory are theoretical frameworks to support data analysis
      - Presentation of the findings should not overstate the evidence
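The point that mean, median and mode can differ for the same data is easy to check with a small, made-up data set. In this Python sketch the values in `task_times` are invented for illustration and include one deliberately large outlier.

```python
from statistics import mean, median, mode

task_times = [1, 1, 2, 3, 13]  # made-up values with one large outlier

averages = {
    "mean": mean(task_times),      # pulled upward by the outlier
    "median": median(task_times),  # middle value once sorted
    "mode": mode(task_times),      # most frequent value
}
print(averages)
```

For these values the mean is 4, the median is 2 and the mode is 1, so which 'average' is reported changes the impression the data give.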