Libram Mea: March 2019

Monday, 18 March 2019

Diversity and Self-Representation

The question of how people represent themselves online has been fraught with complexity since the earliest days of social media. I have been intrigued by this for a long time, especially after living in parts of the world which use modified versions of the standard roman alphabet – or a completely different one altogether – and still navigating the technology and online spaces. I want to now present my thoughts and experience about digital diversity, as part of #23things.

Self-identification is an issue which people need to address in their earliest days of internet access, like what should be your first email address, and user names for different online fora. I for example was so boring that my first email address was the suggested gordon_douglas@hotmail.com. When the time came to replace it, the first idea that came to mind was to represent that I was born in Canberra. This has the added benefit of making it easier for other people to remember, after I give it out.

Curiously, communities form even at this early stage: I remember a friend – whose address was a numbered variant of doorhandle_@hotmail – being approached by other people with doorhandle_ addresses, organising out of spite for whoever had claimed the original doorhandle of hotmail. This included “hacktivists” from vastly different areas, including him in suburban Brisbane.

Social media and games further allowed the potential to customise a virtual character so that it appeared nothing like you. You could be part of the Orcish Horde in World of Warcraft, or one of various alien races in Mass Effect, and no one would really care about your real-world demographics like gender, height, or ethnicity. Virtual worlds like Second Life took character customisation to its extreme, allowing people to express themselves as having giraffe heads walking on springs instead of legs.

Real-world demographics do penetrate these virtual worlds, however. English is easily the dominant language of Internet users, with Chinese as a strong second [I suspect that this is only modern simplified Mandarin, but may also include Cantonese and Taiwanese]. Russian is only meant to account for 109M users globally, but seems to be very active in a number of specific communities. These include several of my favourite computer games, as well as Bitcoin mining. Chinese, Japanese, and Korean Internet subcultures have also developed over the past decade, with a stereotypical link between Koreans and StarCraft II for example, and other games being released explicitly based on Chinese and Japanese history and mythology, to counter the traditional European default.

What this means is that language barriers still arise. How do you start or join a conversation with someone you met online, but whose name you can’t read?

Indeed, default Internet users are still assumed to be people like myself: younger, Caucasoid, Anglophone, able-bodied, and male. Even with evidence that female gamers make up a growing portion of the market, and that 50% of all internet users are from Asia – not to mention growing numbers of people with disabilities – old habits die hard. The major producers still see North America as their main market, and so continue to populate their casts with Caucasian and African American characters.

Robertson, Magdy, and Goldwater, in their 2018 study on use of skin tones in emojis, found that the lightest skin tones were the most popular in personal communications, particularly in Asia. This may have a relationship to how anime traditionally depicts otherwise distinctively Japanese ways of life, but with characters of Caucasoid appearance. They noted however that people with regular and reliable internet access are more likely to be of lighter skin-tones themselves, and so are more likely to be digitally active and visible. In North America, with 94% internet penetration, the use of various skin-tones is markedly more equal than in other parts of the world, further reinforcing the existing relationships between ethnicity, poverty, and access to technology.

The above study was based on the difference between uses of digital icons where the skin-tone could or could not be changed. The default is still a bright yellow, such as in the Simpsons or the typical smiley-face stickers. There is still a great case to be made for virtual escape-zones, where the characters or avatars are so far removed from real-world ethnicities that trying to work backwards to one is meaningless.

Online spaces were not intended to be political, and indeed, there is fierce backlash against “political correctness gone mad” and “social justice warriors” taking over. Real-world politics has also become fixated with denouncing “identity politics”, and yet, what is politics if not about identity? We commonly describe ourselves to others by our occupation, which carries a range of demographic assumptions. We are encouraged - at least in traditional political institutions - to self-identify by where we live, but we will curiously tell people which sports team we support freely, while information like ethnicity or country of birth has to be actively sought. People have always been diverse, and it should be no surprise that their diversity carries over to the internet.

Thursday, 14 March 2019

Online Security

Online security is not something to take lightly. Fortunately, I have been aware of the risks for a long time. The one foremost in my mind is the risk of identity theft: someone uses my name, picture, and a few details to impersonate me. I can be used as a vector to spread a digital virus, or even have my false identity used to commit serious crimes [theft, fraud, hate crimes, and so on]. I also remember watching Sandra Bullock in the Net [1995], as a computer programmer who puts her entire life on the then-nascent internet, and puts her life in danger.

Not taking care of your online security can potentially put other people at risk, too. In this experiment, some ABC journalists shared the digital footprint left behind by their mobile phones over a weekend. In the case of one young man, the wider public were then able to correctly guess when he was moving house, where that new house was, the mobile phone number of his live-in significant other, and that of his mother!

While that kind of revelation is easily shocking, I’m not as overly protective about my data as some people. For example, if I want a mapping application to show me directions around an unfamiliar location, then my mobile will need to transmit data about my location to its server. Given that I have also worked in freedom of information, I am also aware of privacy law, and the regulations around it. So here are a few handy lessons which I’ve learned.

Artificial intelligence is still really stupid. Despite how much data I have given to Facebook, over more than ten years, I have had to literally LOL at some of the recommendations it’s made to me. Even while I mostly access it on my mobile, with the aforementioned location signal broadcast, it still seems to have only a vague idea of where I am [to within ~100 km]. Google Maps and WhereIs – with all the data in the white pages, no less – sometimes can’t even find a street address where I have been before. Just as ridiculous are the suggested contacts provided by LinkedIn, including people I have never met, from completely different industries and countries, with no common connections or interests. Then of course there is the controversy over how easy it is for a picture to be banned for “offensive content” which includes “female-presenting nipples”, but not images of rape or torture. If the machine is so easily fooled, I don’t believe that I am at risk.
Mess with the machines. Perhaps one reason why Facebook and Google are so easily fooled is that I lie to them sometimes. Facebook and Google+ are broadcast media; your profile is available indiscriminately to basically anyone and everyone. I do have real pictures of myself on my profile, but my main profile picture is not one of them. Images which I have used previously include:
1. Gordon Freeman, from video-game series Half Life;
2. Gordon Tracy A.K.A. #6, from the Thunderbirds T.V. programme;
3. Coran, from the new Voltron T.V. programme;
4. Gordon Brown, former British Prime Minister;
5. A random mural on a wall in inner Melbourne.

What all of them have in common is that they mostly look like me, but are images cleverly sourced from elsewhere on the internet. [I also did this for a joke about how Gordon is a relatively uncommon name.]

Additionally, I don’t put my home address into a navigation app when I’m going home from somewhere unfamiliar. Instead, I’d use a nearby landmark, and I would never put it on a publicly-available digital profile.

Because my main image and location are not the first things immediately broadcast about myself, putting my name into a search engine doesn’t find me, which brings me to the next point.

Be a small target. Who would want to be me? Specifically, of all the billions of identities available to steal – and infinitely more which you could just make up – why would you choose to be a mature-aged Caucasoid student in Australia? Writing “Gordon Douglas + Australia” into Google turns up at least a dozen separate identities on the first three pages. These include an ANZAC, a corporate Director, a criminal, a visual artist, and even dead writers and filmmakers. The Director, artists, and the news website who reported on the cases of child abuse all have a vested interest in ensuring that their search results rank highly; I do not [yet]. If you wanted to cheat people out of serious money, wouldn’t you try pretending to be them instead? Since most giant search engines are for-profit businesses, they can easily be manipulated by money. That brings me to my next point.
Remember legal responsibilities. Despite what Facebook might say, Australians legally own their personal data. This is the foundation of our privacy law, which they and all others must obey. If you believe that your data is being misused, or inadequately protected, there are State and Federal Government agencies responsible for enforcing your rights. At the extreme end, someone who successfully steals your identity could be sued, and even imprisoned for fraud. I can also tell you that the Victorian Information Commissioner would put more time and effort into scrutinising your complaints than Facebook would.

Related to this, some credit should be given to digital giants now having internal complaints and review procedures, which they didn’t have at the start. Anyone dealing with big volumes of personal data now realises that they need multiple checks and balances in place, and it’s best when they are automatic. My bank doesn’t know my PIN, or my online password, and doesn’t even track my keystrokes when I’m entering that in, just in case. When dealing with Centrelink or the Tax Office, you have to enter a digital password, then wait for another code sent to the email address or mobile phone number you’ve previously provided, making it effectively a three-stage identity screen, which no one at the client organisations can access.
On that note, it also helps if I don't know my own online passwords. That might not make any sense, so hear me out: I was introduced to the 1Password application, which creates randomised passwords, of any length. I generally make them ~36 characters long, which means that I can't even remember them. The probability of someone being able to guess this string – keeping in mind that lower- and upper-case letters are recognised as different characters – is extremely reassuring. If you really wanted to, you could then change to a different, long, random, complex password every so often.
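
To put a rough number on that reassurance, here is a minimal sketch of the arithmetic. It assumes each of the 36 characters is picked independently and at random from upper-case letters, lower-case letters, and digits only (62 symbols); the character set a real password manager uses is an assumption here, and adding punctuation would only make the numbers bigger. The function names and the guessing rate of one trillion attempts per second are made up for illustration.

```python
import math

# Assumption: 26 lower-case + 26 upper-case letters + 10 digits.
ALPHABET_SIZE = 26 + 26 + 10
PASSWORD_LENGTH = 36  # the length mentioned in the post


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of this length."""
    return alphabet_size ** length


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy, in bits, of a uniformly random password."""
    return length * math.log2(alphabet_size)


def years_to_exhaust(alphabet_size: int, length: int,
                     guesses_per_second: float) -> float:
    """Years needed to try every possible password at a fixed guessing rate."""
    seconds = keyspace(alphabet_size, length) / guesses_per_second
    return seconds / (60 * 60 * 24 * 365)


if __name__ == "__main__":
    total = keyspace(ALPHABET_SIZE, PASSWORD_LENGTH)
    bits = entropy_bits(ALPHABET_SIZE, PASSWORD_LENGTH)
    # Hypothetical attacker making one trillion guesses per second.
    years = years_to_exhaust(ALPHABET_SIZE, PASSWORD_LENGTH, 1e12)
    print(f"Possible passwords: {total:.2e}")
    print(f"Chance of one correct guess: 1 in {total:.2e}")
    print(f"Entropy: {bits:.1f} bits")
    print(f"Years to try them all: {years:.2e}")
```

Under those assumptions the entropy comes out at a little over 214 bits, which is why treating upper- and lower-case letters as different characters matters so much: every extra symbol in the alphabet is multiplied through all 36 positions.
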
Analogue data is sometimes best. I still remember when 9MSN messenger chat windows all had a helpful tip written down the bottom: “never give out your passwords or credit number in a chat message”. Part of this is that records of your online chats are easy to save for later use; they were never intended for secure communications. When a friend writes a chat message asking me to lend them some money, for example, I ask them to give me a phone call to pass on their bank account details for a transfer, which I then write down on a piece of paper, so that I can tear it up and bury it once I’m done. [Either that, or we arrange a time and place to physically hand over the money.] It would be much more effort to hack every stage of that process, just to take the small amount of money available, than to hack into a single, random chat session. Similarly, at a business like a legal firm or insurer, personal information is destroyed once it is used for its intended purpose. That means that document shredding and recycling has become a serious business.
The medium is the message. This is a quote from Marshall McLuhan which I enjoy using. I also like a quote from Chris Rock, saying “you won’t break the law if you just use common sense” [which is a lot like the anti-piracy advertising, “you wouldn’t steal a car…”]. Likewise, your digital footprint won’t be compromised if you don’t put details online that you wouldn’t want someone overhearing in a casual conversation.

Following the earlier example of Will Ockenden from the ABC, you would likely tell your co-workers that you were about to move on the weekend. You probably would not tell them the origin and destination, and what time you planned to leave, or which vehicle you would be using to transport all your stuff.
Sending your data on an encrypted website is like sending it in an armoured truck. You could, of course, be suspicious of the people in the armoured truck. In that case, you still have analogue options, as above.

Friday, 8 March 2019

Autobots

You might have already heard before that “librarians are the original search engine”. I certainly have, at least twice since beginning this degree. It got me thinking though about the trend of automating and digitising key functions.

Thinking of my own habits, I much prefer to look up a library's catalogue and database and place a reservation for what I want, instead of asking for it at a librarian's desk. In archiving, I also preferred to work from a list collated from emails, instead of having a steady drip-feed of notes and phone calls through the day. To me, it's a perfect example of technology use making work more efficient, or working “smarter, not harder”. It also reminds me of the words of Marshall McLuhan, that “the medium is the message”; start with how you want someone to react, based on how the message is received.

Technology has its limits though. Just this morning, one of the uni researchers got a voicemail, even though the original phone call hadn't gotten to his phone. How many times have you been caught in the automatic checkout of a supermarket, when the scales failed to register something in your baggage area? You still have to employ someone to stand around to help twelve customers, instead of twelve people serving one customer at a time. I do still like having someone available at the library desk for when things go wrong. Then there are simple questions about the technology, like which WiFi account to log on to, or how to use the printer. (You might not believe it, but the Magistrates Court, County Court, and Supreme Court of Victoria have very different systems for how to access cases and copy the files.)

You might think this is a Luddite rant about how machines will never be as good as people, because we would say that in this profession, wouldn't we? People have limitations too, of course; the biggest being a failure of imagination. Do you remember for example the Australian census of 2016, with the I. B. M. database and website crashing on the first day? Or when Foxtel was hyping the exclusive release of Game of Thrones season seven, but the system engineers didn't adequately test a massed rush, leading their own customers to download pirated copies of the first episode? People are fundamentally unpredictable, and digital systems don't cope with the unpredictable.

Those episodes to me teach a few lessons. While A. I. & “machine learning” are major buzzwords right now, they won't be programmed to predict consumer behaviour any time soon, because people are too random. I'm reminded of Tony Attwood's quip that autistic children prefer using computers to human teachers, “because computers don't get P. M. T. or hangovers”. People have their place alongside machines, because their exact purpose is to assist us. In terms of my own career, archiving is a good bet, because it is exactly that combination of thinking which files are useful, how it should be accessed, who will make use of it, and directing them to related useful material.

Commonwealth of Data

One of the funny things that's come up in my readings is the Anglo-American Cataloguing Rules (A. A. C. R.). The idea behind it was that people should be able to read and decode any catalogue/library entry, by use of a common format. The existence of these standards might sound dry as a bone to you (and it was to me too), except that I learned a few cool facts, which got me thinking.

As well as providing a standard format, it also standardised features of the language. From my training as a TESOL teacher, I know that British English and American English are as different as, say, French and Quebecois. These rules standardised which spelling is used for what, and which punctuation is used.

Then there are the countries where it applies. Originally of course, it was a collaboration between the British Library and the Library of Congress, based on initial standards from the British Museum. Canada naturally joined in too, and so did Australia, and (West) Germany. To my knowledge, Ireland and New Zealand didn't adopt these rules.

These rules were also brought into line with the International Standard Bibliographic Description (I. S. B. D.), published in 2007. These rules and standards were brought about by collaboration instead of regulation, which could be why they are not widely used outside of the concerned profession. The organised network which drew these up recognises wider applications for information transfer and protection (copyright).

With my previous work in market research, I also came across standards for statistical classifications between Australia and New Zealand. Given that, I wonder if formulation and implementation of rules like this should be a role for the Commonwealth of Nations. The Commonwealth is a natural place to co-ordinate and harmonise information transfer, short of an authority like UNESCO. We already have applications that can link libraries from Australia to Uruguay, as I discovered using public libraries in Melbourne. Using intergovernmental panels could generate extra goodwill and cohesion amongst their member nations, and strengthen professional networks. It would also make institutions like the Commonwealth more relevant.

Sunday, 3 March 2019

What's in the name?

Thanks for checking out my new 'blog, which I intend to use as a personal catalogue of my investigation of information management. It's a much broader area than I expected when I first started looking at it last year. People originally called it "library science", which I had never thought of as a career path before. After all, there's the old saying that if you love books, don't work in a library, because you'll never get to read them.

The other possibilities are quite broad: I previously worked in an I.T. company selling equipment for 'big data' storage and processing, so that's not something which is going away. Public catalogues and archives are used by all sorts of people, for all sorts of things. I even used the earliest maps of Ipswich from Queensland's State Archives to make Christmas presents last year.

To introduce you to my thinking, then, I want to go through my surprisingly inspired thinking for the name of this blog.
First of all, you might notice that it's a Latin pun: Libera mea, "liberate me".

When I was reading and writing for I.T. blogs, I thought a bit about why people still refer to "libraries" for data. This word itself also has a Latin origin, ultimately from liber, the inner bark of trees [paper]. Because the only people concerned with books for a long time in my cultural/historical context were the Catholic Church, the word libraries stuck, replacing previous English words like "book-hoard" or "book-house". Fun fact: the word libricide was also briefly in use, for the destruction of books, and knowledge generally.

[Side benefit, I get to say that I needed to play a computer game for uni. work. Hopefully, that tradition holds up!]

If you ask me, catalogue [for those of us who aren't American] is a much better word for handling records of various types. It comes from Greek: variously translated as "a reckoning list", "a complete count", or a "record of words". Then again, describing myself as a "cataloguing student" would be even more boring than library/data scientist, or data/information management.

Why libram? It turns out not to be a real word, in an academic sense. It was invented as a deliberately arcane word for a heavy book, and has since been popularised by media like Dungeons & Dragons and World of Warcraft. It struck me as being very appropriate then, to describe a digital/virtual record; in a literal sense, it is not a real, physical object.

I didn't have much interest in blogging honestly before coming up with this title. Here's hoping it's enjoyable for each of us, as we delve into Libram Mea in Terra Australis, my digital rants and data-dumps.