Open Education Handbook/Types of Open Data

Many different types of data are relevant to education, and some originate within education itself. Relevant sources might include:

  • Publications & literature: ACM, PubMed, DBLP (L3S), OpenLibrary
  • Domain-specific knowledge & resources: Bioportal for Life Sciences, historic artefacts in Europeana, Geonames
  • Cross-domain knowledge: Wikidata, DBpedia, Freebase, ...
  • (Social) media resource metadata: BBC, Flickr, Wikimedia Commons, ...

Explicitly educational datasets and schemas include:

There are also many different ways to categorise this data. One possible grouping is:

  • Student data: attendance, grades, skills, exams, homework
  • Course data: employability related to courses, curriculum, syllabus, VLE data, number of textbooks, skills, digital literacy…
  • Institution data: location data, success/failure rates, results, infrastructure, power consumption, student enrolment, textbook budget, teacher names and contracts, dropout rates, total cost of ownership, sponsorship, cost per pupil, graduation rates, gender balance (male vs female), years in education, ratio of students to teaching staff
  • User-generated data: learning analytics, assessments, performance data, job placements, laptop data, time on tasks, use of different programmes/apps, website data
  • Policy/Government data: equity, budgets, spending, UNESCO literacy data, deprivation and marginalisation in education, participation

In addition to information about open licensing, a more detailed description of an open data set may include:

  • Provenance
    • Reference (government data, geo-data, etc.), e.g. national curriculum
      • Location of schools, universities, etc.
    • Core/Internal (course catalogue, course resources, staff data, buildings, etc.)
    • User-generated/contributed (user activities, assessments, etc.)
  • Granularity
    • individual/personal
    • aggregated/analytics
    • report
  • Descriptiveness
    • data streams (multimedia resources)
    • data content (textual content, database)
    • resource metadata
    • content metadata
    • paradata (as in metadata about data collection)
  • Content
    • Usage/activity data (paradata as in the learning analytics definition)
    • student personal info
    • student profiles (interest, demographics, etc.)
    • student trajectories
    • curriculum / learning objectives / learning outcomes
    • educational resources (multimedia or not)
    • resources metadata (including library collections, reading lists - see Talis Aspire)
    • assessment/grades
    • institutional performance (e.g., OFSTED, KIS)
    • research outputs (publication repositories, etc.), research management data (projects and funding, etc.), research data
    • cost and student funding data, budgets and finances
    • Classifications/disciplines/topics (e.g. JACS)
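
The four dimensions above can be read as fields of a simple record that describes a dataset. The following is a minimal sketch in Python, not a standard schema: the class and field names (`DatasetDescription`, `Provenance`, `Granularity`, `Descriptiveness`, `is_personal`) and the example dataset are hypothetical and chosen only to mirror the list.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Provenance(Enum):
    REFERENCE = "reference"            # e.g. government data, geo-data
    CORE_INTERNAL = "core/internal"    # e.g. course catalogue, staff data
    USER_GENERATED = "user-generated"  # e.g. user activities, assessments


class Granularity(Enum):
    INDIVIDUAL = "individual/personal"
    AGGREGATED = "aggregated/analytics"
    REPORT = "report"


class Descriptiveness(Enum):
    DATA_STREAM = "data stream"
    DATA_CONTENT = "data content"
    RESOURCE_METADATA = "resource metadata"
    CONTENT_METADATA = "content metadata"
    PARADATA = "paradata"


@dataclass
class DatasetDescription:
    """Hypothetical record combining licence information with the four dimensions."""
    title: str
    licence: str
    provenance: Provenance
    granularity: Granularity
    descriptiveness: Descriptiveness
    content: List[str] = field(default_factory=list)  # free-text content tags

    def is_personal(self) -> bool:
        # Individual-level data usually needs extra care before release.
        return self.granularity is Granularity.INDIVIDUAL


# Illustrative only: an invented dataset, not a real publication.
example = DatasetDescription(
    title="School locations (hypothetical)",
    licence="CC BY 4.0",
    provenance=Provenance.REFERENCE,
    granularity=Granularity.AGGREGATED,
    descriptiveness=Descriptiveness.DATA_CONTENT,
    content=["institutional location"],
)
print(example.is_personal())
```

Using enumerations for the first three dimensions keeps descriptions comparable across datasets, while the content field stays a free list because a single dataset can cover several content types.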

Further resources