Web Analytics Made Easy - Statcounter

We need open data in the open source AI definition (or an alternative?)

We need open data in the open source AI definition (or an alternative?)
Photo by Giulia May / Unsplash

....Or how do we stop the impending 'finger pointing shit show'

This is truly a 'note' of Emma's notes', maybe I'll rephrase this later (by the way I use dashes in my writing, not em-dashes, but dashes, its me not AI)

I have attended a lot of sessions on WHY the OSI AI 1.0 definition allows data to be described instead of open; I deeply respect a lot of people who have been involved in that decision - and understand some of the tension (health care data etc), and the proposal to think of it as a stack, rather than one thing - but in a world where AI is literally taking over - having no true accountability for sources it's already becoming a shit show.

Just this morning, I opened my computer to two different examples - both in the open source world (canary in coal mine?). First, the challenge about the origins of a GitHub process chart (shared in CHAOSS #wg-ai-alignment WG)

Microsoft uses plagiarized AI slop flowchart to explain how Github works, removes it after original creator calls it out: ‘Careless, blatantly amateuristic, and lacking any ambition, to put it gently’
The official Introduction to Github page included an AI-generated graphic with the phrase “continvoucly morged” on it, among other mistakes.

Second (and sadly I had to download Threads to see this), finger pointing about origins of a marketing phrases.

Post by @marialeal@vivaldi.net
View on Mastodon

A lot of very smart people have been vocal about open data needing to be part of Open Source AI definition , and I am late to the push there - but I cannot see how this will get better without people (being influenced to) starting to build, share and use systems that have transparent data and attribute sources.

It doesn't even seem (correct me if I am wrong) that people are embracing open source AI as an opportunity to do better by humans; its hard to find a lot of research on that (although there is some), just a hunch and observing the marketing push primarily in service of profit.

We spent so MUCH TIME in open source/open access/open science/open education teaching and advocating for attribution and licensing, because it matters - not because it was a nice thing to do, its because given the opportunity, people will appropriate, misattribute and feel rushed (especially with the push to be fast right now) to hide sources as a defense to accountability.

Or maybe I am missing something.

Subscribe to Emma's open notes

Sign up now to get access to the library of members-only issues.
Jamie Larson
Subscribe
Licensed under CC BY-SA 4.0