Inclusive Data Research Skills for Arts and Humanities/Data agencies

Introduction and context

The objectives of this section were to:

  1. Co-develop the working concept 'Data Agencies' as an approach to data research methods for marginalised and excluded groups
  2. Create a self-sustaining community of arts and humanities researchers interested in data and digital research
  3. Co-write a Data Agency toolkit for arts and humanities researchers interested in data and digital research, using Wikibooks as a tool.


The core question is: how do we approach data research methods in ways that are empowering for marginalised and excluded groups?

"Critical science and decolonial theory, when used in combination, can pinpoint the limitations of an AI system and its potential ethical and social ramifications, becoming a "sociotechnical foresight tool" for the development of ethical AI" (Royer, 2020: 22).[1] Data epistemologies: how can we move past critique and challenge or complement scientific disciplines (challenging the dualism between coloniser and colonised in order to redefine decolonisation)?

By infusing a decolonial critical approach into AI, data and socio-technical communities, we could establish insights and approaches that better connect research and technology development to established ethical values grounded in decolonial theory. Decolonial theory has its foundations in race, law, feminism, queer theory, and philosophical technology studies. Understanding the blind spots and limitations of a particular technology necessitates exposing the power dynamics and political relationships that support its application. This will require the development of new research cultures, as well as original technical research in equality and fairness, including the definition of fairness and its ideals, translation, and privacy.

By encouraging inclusive dialogue in research methodologies we could contribute to the development of a responsible AI and a renewed responsibility to current technologies. To critically engage with the past and present, researchers must try to unlearn colonial reasoning, reinstate norms of living that were previously incompatible with life, and create new forms of political and affective transdisciplinary research communities to address these challenges.

Ontological Turn: Important distinctions between dualisms have been adopted in humanities theory. "A diverse body of work known as the “ontological turn” has made important contributions to anthropological theory. In this article, I build on this work to address one of the most important theoretical and political issues haunting contemporary theories of technology: the opposition of the “digital” to the “real.” This fundamentally misrepresents the relationship between the online and offline, in both directions. First, it flies in the face of the myriad ways that the online is real. Second (and just as problematically), it implies that everything physical is real. Work in the ontological turn can help correct this misrepresentation regarding the reality of the digital" (Boellstorff, 2016: 387).[2]

Define how we are using ontology here in relation to data and the digital (to expand: Holbraad; links between decolonial theory, animism, and process philosophy).

Ontology as a way of enacting a reality or realities: performing the process, and constructing and acting out those life worlds. What we believe our world is made up of plays a part in how we interact with it, and our understandings of that world (reality/realities) are also shaped by our actions.

What are Data Agencies?

Data agencies refers to the agency of humans and non-humans. For example, data itself has agency. Which values shape Data Agencies?[3] Where do they come from? How and why?


Values will be a crucial issue if we consider social environments/entanglements as a project of mutual creation, cooperatively constructed and rebuilt.

How are we defining agency? And how are we prioritizing it (who/what gets priority)? - develop


Where is the agency found?

How does it interact with the machine and human in symbiosis?

Where is the intersection of contact?

Agency is embedded in the data through historical processes, as an extension of the human mind, but it takes on an agency of its own as data.

Data and algorithmic code encounter the human, but where is the concrete set of connections between humans and machines?

Can we build a system that does not set the machine and data/algorithms against the body[4] or being human, but is instead a coevolution that can be decolonised, with its biases revised?

If so, at which point in the process?

Where are the intersecting value systems that could define an agreed set of 'fairness' values to work with?

Even if we try to adopt pre-existing frameworks or establish prioritization checklists to support research endeavors, a goal of decolonial data agency is to question methodologies and practices, within reason. There will always be more context that can be provided to support research conclusions, but balancing interpretability and explainability with performance and the ability to execute is as fundamental a part of a research process as it is of an AI model.

Data epistemologies

What are data epistemologies?

  • "Epistemic cultures are cultures of creating and warranting knowledge. This is what the choice of the term 'epistemic' rather than simply 'knowledge' suggests ... [i]t brings into focus the content of the different knowledge-oriented lifeworlds, the different meanings of the empirical... particular ontologies of instruments, specific models of epistemic subjects" (Knorr-Cetina as cited in Lury 2021: 11)
  • "Big Data, coupled with new data analytics, challenges established epistemologies across the sciences, social sciences and humanities, and assesses the extent to which they are engendering paradigm shifts across multiple disciplines. In particular, it critically explores new forms of empiricism that declare ‘the end of theory’, the creation of data-driven rather than knowledge-driven science, and the development of digital humanities and computational social sciences that propose radically different ways to make sense of culture, history, economy and society" (Kitchin 2014: 1[5]).


Gaps between data literacies (individualised), data infrastructures (e.g. university computing, data tools and developers, disciplinary approaches)[6] and critical data thinking[7] (e.g. data feminism, data conscience, critical data literacies)

CARE principles.[8]

Data Epistemologies map


Examples of data epistemologies

Nodocentrism and paranodality (Mejias 2009[9]; Mejias 2013[10]; Barnes 2020[11])

  • “…the digital network as part of a media economy that reproduces inequality through a hegemonic–yet consensual and pleasurable–culture of participation. To support my thesis, I consider the politics of inclusion and exclusion of the network. In order for something to be relevant or visible within the network it needs to be rendered as a node (a phenomenon I refer to as “nodocentrism”). Thus, digital networks are constituted as totalities by what they include as much as by what they exclude” (Mejias 2013)
  • How do we do paranodality?
    • Engage multiple methods such as Social Network Analysis AND participant observation, action research, ethnography, interviews etc. - Develop for toolkit
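
One way to make the paranodal visible in practice is to compare who appears as a node in a network data set with who is known through other methods such as participant observation or interviews. The sketch below is a minimal illustration in plain Python. The names, the edge list, and the fieldwork_participants set are all hypothetical, and it is offered as one possible starting point for the toolkit rather than a method proposed by Mejias.

```python
# Hypothetical interaction data, e.g. replies collected from a platform.
# Each pair is (source, target); only people who appear here become "nodes".
edges = [
    ("amina", "ben"),
    ("ben", "chloe"),
    ("chloe", "amina"),
]

# Hypothetical set of people met through participant observation or interviews.
fieldwork_participants = {"amina", "ben", "chloe", "dev", "esi"}


def nodes_in_network(edge_list):
    """Return everyone the network data renders as a node."""
    nodes = set()
    for source, target in edge_list:
        nodes.add(source)
        nodes.add(target)
    return nodes


def paranodal(participants, edge_list):
    """Return participants known from fieldwork but absent from the network data."""
    return participants - nodes_in_network(edge_list)


if __name__ == "__main__":
    # People a purely network-based analysis would not see.
    print(sorted(paranodal(fieldwork_participants, edges)))
```

The design choice here is to treat the network data as only one partial view: the set difference lists people the network analysis alone would leave out, which can then guide further qualitative work.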




The Alternative Epistemologies of Data Activism (Milan & Velden, 2016[12])

  • "Data activism indicates the range of sociotechnical practices that interrogate the fundamental paradigm shift brought about by datafication. Combining Science and Technology Studies with Social Movement Studies, this theoretical article offers a foretaste of a research agenda on data activism. It foregrounds democratic agency vis-à-vis datafication, and unites under the same label ways of affirmative engagement with data (“proactive data activism”, e. g. databased advocacy) and tactics of resistance to massive data collection (“reactive data activism”, e. g. encryption practices), understood as a continuum along which activists position and reposition themselves and their tactics".


Materiality of networks and data (e.g. Starosielski[13])


Multi-methods and/or other methods:

  • Ethnography[15]
  • Auto/biographical approaches[16]
  • Autoethnography/arts and humanities researchers[17]
  • Quantitative/Qualitative[18]
  • Multimodal representations[19]
  • Centering community and worldview, choose methods best for the community, what works best for them[20]
  • Mixed methods and multi methods[21]
  • Theory of change (empowered researcher in the process)[22]
  • Starting from an ethics of care - CARE Principles.[23]
  • Reflexivities of discomfort (Pillow, 2003)[24]
  • Indigenous Data Sovereignty Networks.[25]
  • Participatory Action Research (PAR) - Data Stewardship.[26]

What does data agency look like? Definition

Data agencies may be regarded as multilayered, since they interact with a variety of systems at different times (people, social structures, economies, and technical systems), all of which may be examined using diverse approaches. Data agencies possess agency distinct from humans and exert multidirectional influence, representing varying meanings[27][28] to individuals in different circumstances and times, influenced by the researcher's agency and reflexivity. These agencies can affect both humans and machines (as actors)[29] and create areas of conflict that may generate prospective changes.

Data agencies are inherently complex, impacting individuals and data at multiple junctures through feedback loops. If we consider data agencies as a crucial link between the machine and the human, machine learning technologies become a key analytical tool for potentially decoding data agencies.

One way to think of data agencies is to consider the humility of the researcher and how the data challenges the researcher. For example, the researcher should be prepared to shift their own thinking and enter into a reciprocal relationship with the data and its agency (as an object of study). In this relationship the researcher is not just trying to explain and interpret the data but allows it to shift them, so that the data becomes a participant in a movement within the researcher's own thinking[30]. The agency of the data could lead us to analytical humility and reflexivity, while allowing us to conceptualise the researcher's agency and the data's agency as mutual entities inhabiting different worlds (Holbraad et al, 2014).[31]

  • Definitions
    • Data should not stand alone but within the human experience.[32]
    • Data as an extension of the human mind and historical practices of collecting data embedded in the data.
    • Data has agency itself.
    • Where do data agencies fit into the wider structure, and where is the data living?
    • Multilevel data agencies (not just the researcher, data agencies are intertwined with wider social structures).
    • Data and human are re-transformed through entanglements, and we cannot separate the human from the data.
    • In encountering the data agency, we must realise that we may reach the limitations of our own thinking (through alterity/difference)[33].
    • Invent new resources to shift our own ways of thinking and re-evaluate our own assumptions, positionality, and the categories of thinking we are using in research (reflexivity).
    • Consider critiques and difficulties with data already existing in the agency of the data.
    • Representation: inclusion and exclusion in data sets and data agencies.

Would it be possible to decolonise data agencies?

Machine learning often superficially scrapes data instead of delving into in-depth analysis (to expand).

Issues of alignment and values:[34] competing values may prevent stakeholders from defining fairness[35] while developing machine learning systems. Where is the intersection of an agreed-upon idea of 'fairness' between distinct value systems (which may be applied to frameworks)? Consider the concept of sameness: not universalism, but a sameness within or 'inside' difference (Taussig on alterity/mimesis, and Graeber).
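
To illustrate why stakeholders may struggle to agree on a definition of fairness, the sketch below compares two common formalisations on the same predictions: demographic parity (groups receive positive predictions at the same rate) and equal opportunity (groups have the same true positive rate). The toy records are invented for illustration, and these two measures are only examples of formal definitions, not a claim about which values should apply.

```python
def selection_rate(preds):
    """Share of cases that received a positive prediction."""
    return sum(preds) / len(preds)


def true_positive_rate(labels, preds):
    """Share of truly positive cases that were predicted positive."""
    hits = [p for y, p in zip(labels, preds) if y == 1]
    return sum(hits) / len(hits)


# Invented toy data: true labels and model predictions for two groups.
group_a = {"labels": [1, 1, 0, 0], "preds": [1, 1, 0, 0]}
group_b = {"labels": [1, 1, 1, 0], "preds": [1, 1, 0, 0]}

# Demographic parity compares how often each group is selected.
parity_gap = abs(selection_rate(group_a["preds"]) - selection_rate(group_b["preds"]))

# Equal opportunity compares how often deserving cases are recognised.
opportunity_gap = abs(true_positive_rate(**group_a) - true_positive_rate(**group_b))

if __name__ == "__main__":
    print("demographic parity gap:", parity_gap)
    print("equal opportunity gap:", round(opportunity_gap, 3))
```

In this toy case both groups are selected at the same rate, yet one group has a positive case that goes unrecognised, so the same system looks fair under one definition and unfair under the other.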

How, then, can biases be revised by decolonial approaches?[36] For example, through animism, or through indigenous languages, meaning and context? Would it be possible to build a decolonial AI?[37]

Decolonial thinking, from an African perspective, challenges the dualism inherent in scientific thought through relational, contextual, and historical approaches (Fanon, 1952)[38]. A decolonial approach could play a crucial role in identifying and addressing biases in machine learning by examining the historical and contextual factors that shape data agencies. This approach could enable a more nuanced understanding of the information being processed, thereby enhancing the accuracy and cultural sensitivity of the outcomes.

With deep learning mechanisms that prioritise calculation in data sets and corpus-gen-AI, what are the issues, and where can a decolonial position materialise?

Possible entry point: Assessment of base code at the test and evaluation stage may be helpful, considering the potential impact on model performance and efficiency from a decolonial perspective. Additionally, incorporating feedback loops during the training process could further refine the deep learning mechanisms and show how use cases and biases in data sets could affect protected classes, so that these can be revised before the machine has an impact.
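
As one hedged illustration of revising a data set before it has an impact, the sketch below compares each group's share of a data set with a reference share that the researchers and community have agreed on. The group names, counts, and reference shares are hypothetical, and how a reference share should be set is itself a question of values that the code cannot answer.

```python
from collections import Counter


def representation_gaps(records, reference_shares, key="group"):
    """Return observed share minus reference share for each reference group."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        gaps[group] = observed - expected
    return gaps


# Hypothetical data set: 8 records from one language group, 2 from another.
records = [{"group": "lang_a"}] * 8 + [{"group": "lang_b"}] * 2

# Hypothetical reference shares agreed with the community concerned.
reference_shares = {"lang_a": 0.5, "lang_b": 0.3, "lang_c": 0.2}

gaps = representation_gaps(records, reference_shares)

# Groups whose share falls below the reference, including absent ones.
underrepresented = sorted(group for group, gap in gaps.items() if gap < 0)

if __name__ == "__main__":
    print(underrepresented)
```

A check like this only flags exclusion by count. It says nothing about how a group is represented within the records, which still calls for qualitative review.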

Ethnography and Large Language Models



How could a decolonial approach that accounts for data agencies and bias in data be developed as a critical approach, and ultimately, how do we get beyond that?

Can we use “polyvocality[39],” a movement towards the inclusion of multiple perspectives in and on data agencies?

How reliable is the data to obtain the perspectives of different stakeholders and the relationships between them and the data agencies? What theoretical framework do we use to define perspectives and their relations with the underlying data? Understanding polyvocality in machine learning will require a methodology that can accurately capture the diverse perspectives of stakeholders. A combination of qualitative and quantitative research methods may be beneficial in developing a theoretical framework that defines perspectives and their interconnectedness within machine learning and data agencies.
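
One small, practical reading of polyvocality in machine learning is to keep stakeholder perspectives disaggregated rather than collapsing them into a single "correct" label. The sketch below contrasts a majority vote with a per-group summary. The annotation records and stakeholder group names are hypothetical, and this is only one possible way of operationalising the idea.

```python
from collections import Counter, defaultdict

# Hypothetical annotations: (item, stakeholder group, label).
annotations = [
    ("post_1", "community_members", "harmful"),
    ("post_1", "community_members", "harmful"),
    ("post_1", "platform_moderators", "acceptable"),
    ("post_1", "platform_moderators", "acceptable"),
    ("post_1", "platform_moderators", "acceptable"),
]


def majority_label(rows, item):
    """Collapse all annotations for an item into the single most common label."""
    labels = Counter(label for i, _, label in rows if i == item)
    return labels.most_common(1)[0][0]


def labels_by_group(rows, item):
    """Keep each stakeholder group's label counts separate for an item."""
    grouped = defaultdict(Counter)
    for i, group, label in rows:
        if i == item:
            grouped[group][label] += 1
    return {group: dict(counts) for group, counts in grouped.items()}


if __name__ == "__main__":
    print(majority_label(annotations, "post_1"))
    print(labels_by_group(annotations, "post_1"))
```

The majority vote hides the fact that the two groups disagree entirely, while the per-group view keeps that disagreement available for interpretation alongside qualitative material.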

File:Data agencies image


Examples?

  • Data 4 Black Lives (D4BL)[40]
  • Masakhane[41]
  • Data cooperatives
  • Data stewards and stewardships
  • Data trusts
  • Data feminism
  • Machine learning for te reo Māori
  • Indigenous AI[42]
  • AI Intersections database[43]

What are the strengths and weaknesses of these examples?

What could some alternatives look like? Especially those that account for people?

Challenges to the data agency approach

Political economy approach to data agencies: rather than abandon the categories of “subject” and “object” and of “Society” and “Nature,” as suggested by proponents of “the ontological turn,” researchers can compare subject–object transformations and the naturalization of social power relations in the two contexts. In acknowledging the ultimate dependence of modern technology on exchange rates and financial strategies in a globalized economy, we realize that the agency of modern artifacts is also dependent on human subjectivity. In shifting the focus of comparative anthropology from ontology to political economy, we can detect that modern technology is a globalized form of magic (Hornborg, 2015: 35).[44]


How do Data agencies relate to Roy Bhaskar's Critical Realism[45] and the Philosophy of Meta-Reality Part II: Agency, Perfectibility, Novelty?[46]

Would a multi methods approach to data agencies be an appropriate method?



Can a data toolkit be developed to analyse data agencies during research for arts and humanities researchers?

What kinds of things would help us?

  • Data epistemologies map - conceptual framework[47]
  • Meta survey of methods
  • Diagnostic tool - what methods and skills are related to what kinds of questions?
  • Person-centred
  • "Data joy" (what does this look like?)[4]
  • Advocacy and participatory approaches and centering community worldview.
  • Choose the research methods best suited to the community and to the outputs that you are aiming for in the research project.[48]
  • Promote data and the DaRes project as a resource and as an agency for the community.
  • Research should not be extractive.
  • Defining concept - data agencies, multi methods
  • Ethnography of large language models/linguistic anthropology

Data Empowerment for Empowerment of Arts and Humanities Research

Interdisciplinary work centred on data activism and justice research is key, as well as how to avoid mistranslation and develop a common language for arts and humanities researchers to use.

Add decolonial data methods and data agencies toolkit

Examples:

Part of this will include how to feed into technical and legal frameworks to raise awareness and question the coding[49] and design of data sets.[50] It will also include how researchers might be empowered by legal, policy, and technological advancements, as well as what else needs to change. Data empowerment could also entail legitimising a non-scientific worldview of data through the arts and humanities' multiple complexities and values.



Citations and Related Readings

  1. "The Short Anthropological Guide to the Study of Ethical AI". Montreal AI Ethics Institute. 2020-08-30. Retrieved 2024-01-27.
  2. Boellstorff, Tom (2016-08). "For Whom the Ontology Turns: Theorizing the Digital Real". Current Anthropology. 57 (4): 387–407. doi:10.1086/687362. ISSN 0011-3204.
  3. Graeber, David (2013-06). "It is value that brings universes into being". HAU: Journal of Ethnographic Theory. 3 (2): 219–243. doi:10.14318/hau3.2.012. ISSN 2575-1433.
  4. Mauss, Marcel (1973-02). "Techniques of the body". Economy and Society. 2 (1): 70–88. doi:10.1080/03085147300000003. ISSN 0308-5147.
  5. Kitchin, Rob (2014-04-01). "Big Data, new epistemologies and paradigm shifts". Big Data & Society. 1 (1): 205395171452848. doi:10.1177/2053951714528481. ISSN 2053-9517.
  6. "Bastard Algebra | STS Infrastructures". stsinfrastructures.org. Retrieved 2024-04-09.
  7. Seaver, Nick (2018-08-21). "What Should an Anthropology of Algorithms Do?". Cultural Anthropology. 33 (3): 375–385. doi:10.14506/ca33.3.04. ISSN 1548-1360.
  8. "CARE Principles". Global Indigenous Data Alliance. 2023-01-23. Retrieved 2024-01-26.
  9. Mejias, U. (2009) The limits of networks as models for organizing the social, New Media and Society. Vol 11(8): 1–18. doi:10.1177/1461444809341392
  10. Mejias, U. (2013) Off the Network: Disrupting the Digital World. Minnesota Press
  11. Barnes, N. (2020). Trace publics as a qualitative critical network tool: Exploring the dark matter in the #MeToo movement. New Media & Society, 22(7), 1305-1319.
  12. Milan, Stefania; Velden, Lonneke van der (2016-12-01). "The Alternative Epistemologies of Data Activism". Digital Culture & Society. 2 (2): 57–74. doi:10.14361/dcs-2016-0205. ISSN 2364-2122.
  13. Starosielski, N. (2015) The Undersea Network. Duke University Press
  14. Boellstorff, Tom; Maurer, Bill; Bell, Genevieve; Gregg, Melissa; Seaver, Nick (2015). Data, Now Bigger and Better!. Prickly Paradigm Press. ISBN 978-0-9842010-6-8.
  15. Ethnography and Virtual Worlds. 2012-09-24. ISBN 978-0-691-14950-9.
  16. Atay, Ahmet (2020-11). "What is Cyber or Digital Autoethnography?". International Review of Qualitative Research. 13 (3): 267–279. doi:10.1177/1940844720934373. ISSN 1940-8447.
  17. "Contemporary Autoethnography Is Digital". online.ucpress.edu. https://online.ucpress.edu/joae/article/1/1/43/1586/Contemporary-Autoethnography-Is-Digital. Retrieved 2024-01-27.
  18. Dawson, Catherine (2019-07-10), "Digital visual methods", A–Z of Digital Research Methods, Abingdon, Oxon; New York, NY: Routledge, pp. 107–113, retrieved 2024-01-27
  19. Jewitt, Carey (2013), "Multimodal Methods for Researching Digital Technologies", The SAGE Handbook of Digital Technology Research, London: SAGE Publications Ltd, pp. 250–265, retrieved 2024-01-27
  20. Costanza-Chock, Sasha (2020-03-03). Design Justice. The MIT Press. ISBN 978-0-262-35686-2.
  21. Hesse-Biber, Sharlene Nagy; Johnson, R. Burke, eds. (2015-08-06). "The Oxford Handbook of Multimethod and Mixed Methods Research Inquiry". doi:10.1093/oxfordhb/9780199933624.001.0001.
  22. Kubisch, Anne C.; Connell, James P.; Fulbright-Anderson, Karen (2001), "Evaluating Complex Comprehensive Community Initiatives: Theory, Measurement and Analysis", Rebuilding Community, London: Palgrave Macmillan UK, pp. 83–98, ISBN 978-1-349-41111-5, retrieved 2024-01-27
  23. "CARE Principles". Global Indigenous Data Alliance. 2023-01-23. Retrieved 2024-01-26.
  24. Pillow, Wanda (2003-03). "Confession, catharsis, or cure? Rethinking the uses of reflexivity as methodological power in qualitative research". International Journal of Qualitative Studies in Education. 16 (2): 175–196. doi:10.1080/0951839032000060635. ISSN 0951-8398.
  25. "Indigenous Data Sovereignty Networks". Collaboratory for Indigenous Data Governance. 2020-05-19. Retrieved 2024-01-26.
  26. "Participatory data stewardship". www.adalovelaceinstitute.org. Retrieved 2024-01-26.
  27. Kockelman, Paul (2013-12). "The anthropology of an equation: Sieves, spam filters, agentive algorithms, and ontologies of transformation". HAU: Journal of Ethnographic Theory. 3 (3): 33–61. doi:10.14318/hau3.3.003. ISSN 2575-1433.
  28. Maurer, Bill (2013-12). "Transacting ontologies: Kockelman's sieves and a Bayesian anthropology". HAU: Journal of Ethnographic Theory. 3 (3): 63–75. doi:10.14318/hau3.3.004. ISSN 2575-1433.
  29. Pedersen, Morten Axel (2023-01). "Editorial introduction: Towards a machinic anthropology". Big Data & Society. 10 (1): 205395172311538. doi:10.1177/20539517231153803. ISSN 2053-9517.
  30. Kaur, Raminder (2019-09). "The digitalia of everyday life: Multi-situated anthropology of a virtual letter by a "foreign hand"". HAU: Journal of Ethnographic Theory. 9 (2): 299–319. doi:10.1086/705581. ISSN 2575-1433.
  31. "Holbraad, M., & Pedersen, M. A. (2014). The Politics of Ontology. Fieldsights-Theorizing the Contemporary. Cultural Anthropology. - References - Scientific Research Publishing". www.scirp.org. Retrieved 2024-01-27.
  32. Koopman, Colin (2019). How We Became Our Data: A Genealogy of the Informational Person. Chicago; London: The University of Chicago Press. ISBN 978-0-226-62644-4.
  33. "Mimesis and Alterity: A Particular History of the Senses". Routledge & CRC Press. Retrieved 2024-04-09.
  34. Christian, Brian (2020-10-06). The Alignment Problem: Machine Learning and Human Values. National Geographic Books. ISBN 978-0-393-63582-9.
  35. Henricks, Kasey (2017-06-01). ""I'm Principled Against Slavery, but …": Colorblindness and the Three-Fifths Debate". Social Problems. 65 (3): 285–304. doi:10.1093/socpro/spx018. ISSN 0037-7791.
  36. Hanna, Alex; Denton, Emily; Smart, Andrew; Smith-Loud, Jamila (2020-01-27). "Towards a critical race methodology in algorithmic fairness". Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. New York, NY, USA: ACM. doi:10.1145/3351095.3372826.
  37. Adams, Rachel (2021-04-03). "Can artificial intelligence be decolonized?". Interdisciplinary Science Reviews. 46 (1–2): 176–197. doi:10.1080/03080188.2020.1840225. ISSN 0308-0188.
  38. Fanon, Frantz (2021-03-25). Black Skin, White Masks.
  39. Thimm, Viola; Chaudhuri, Mayurakshi; Mahler, Sarah J. (2017-06). "Enhancing Intersectional Analyses with Polyvocality: Making and Illustrating the Model". Social Sciences. 6 (2): 37. doi:10.3390/socsci6020037. ISSN 2076-0760.
  40. Data 4 Black Lives.
  41. "Masakhane - Masakhane MT: Decolonise Science". www.masakhane.io. Retrieved 2024-01-26.
  42. "INDIGENOUS AI". INDIGENOUS AI. Retrieved 2024-04-10.
  43. "AI Intersections Database". AI Intersections Database. Retrieved 2024-04-22.
  44. Hornborg, Alf (2015-03). "The political economy of technofetishism". HAU: Journal of Ethnographic Theory. 5 (1): 35–57. doi:10.14318/hau5.1.003. ISSN 2575-1433.
  45. Bhaskar, Roy (2020-03-14). "Critical realism and the ontology of persons". Journal of Critical Realism. 19 (2): 113–120. doi:10.1080/14767430.2020.1734736. ISSN 1476-7430.
  46. Bhaskar, Roy (2002-11-15). "The Philosophy of Meta-Reality: Part II: Agency, Perfectibility, Novelty". Journal of Critical Realism. 1 (1): 67–93. doi:10.1558/jocr.v1i1.67. ISSN 1476-7430.
  47. "Matt Artz's Official Website". Retrieved 2024-04-23.
  48. McCulloch, Gretchen. "Coding Is for Everyone—as Long as You Speak English". Wired. ISSN 1059-1028. https://www.wired.com/story/coding-is-for-everyoneas-long-as-you-speak-english/.
  49. "Coderspeak". UCL Press. Retrieved 2024-04-22.
  50. Cave, Stephen; Dihal, Kanta (2020-12-01). "The Whiteness of AI". Philosophy & Technology. 33 (4): 685–703. doi:10.1007/s13347-020-00415-6. ISSN 2210-5441.