Stilgherrian is an Australian journalist covering internet policy, cybersecurity, digital surveillance and privacy.

Sep 15, 2017
Published on: DirectorTech

Computer algorithms and big data aren’t magic, says Stilgherrian. They simply implement their creator’s view of the world, whether it’s right or wrong. That makes using them a matter of ethics.

Computers are already replacing middle managers. When that happens, organisations become a kind of machine-human hybrid. That’s a worry, according to publisher Tim O’Reilly, founder of O’Reilly Media, the publisher of books from which so many network engineers and internet software developers learned their craft.

“When you think about a system like [ride-hailing services] Uber or Lyft, there are programmers who are the top managers, and then there are their programs, the algorithms who are the middle managers, and then there are humans who are actually workers reporting to the algorithms,” O’Reilly told the Asia Pacific Regional Internet Conference on Operational Technologies (APRICOT) in Ho Chi Minh City, Vietnam, in February.

“When you have that Uber app, it tells you where to go, who to pick up, passengers rate you, your work scores are reported, and you can be dismissed on the basis of the data that’s tracked about you … Uber has increasingly treated their workers as a disposable commodity, rather than figuring out how to empower them,” he said.

“We have algorithms that are increasingly managing our businesses and managing our society. We have to make the right choices to design those algorithms in a way that leads to a better workplace for human beings.”

And, it must be said, a better world for their customers, not just for the shareholders.

Uber’s prices reflect demand, and during massive demand spikes their so-called surge pricing can be several times the base price. Surge pricing is seen during bad weather, special events or celebrations like New Year’s Eve, when public transport systems fail, or even during disasters.

During the 2014 hostage siege at Sydney’s Lindt Cafe, Uber’s algorithm responded dutifully to the demand of people fleeing the CBD, raising prices fourfold and setting the minimum price for any journey at $100. One potential customer’s quote for the short run from the Sydney CBD to the airport was “$145 to $184”.

It was more than three hours before a human finally overruled the computer. Customers were refunded, but the brand damage had already been done.
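The arithmetic of a fourfold surge with a $100 floor is simple enough to sketch. The following is only an illustration, not Uber's actual pricing code: the function name `surge_fare`, its parameters and the base fares are all hypothetical. It shows how a rule applied mechanically produces the same answer whatever the reason for the demand spike.

```python
def surge_fare(base_fare: float, multiplier: float, minimum_fare: float) -> float:
    """Return a surged fare that never drops below the minimum.

    Hypothetical rule for illustration only; the real formula is not public
    in the source article.
    """
    if base_fare < 0 or multiplier <= 0 or minimum_fare < 0:
        raise ValueError("fares must be non-negative and multiplier positive")
    return max(base_fare * multiplier, minimum_fare)


if __name__ == "__main__":
    # Hypothetical base fares: a longer trip and a very short one.
    print(surge_fare(40.0, multiplier=4.0, minimum_fare=100.0))  # 160.0
    print(surge_fare(15.0, multiplier=4.0, minimum_fare=100.0))  # 100.0
```

Nothing in a rule like this knows why demand has spiked. Distinguishing New Year's Eve from a hostage siege needs either extra inputs or a human with the authority to step in quickly.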

There have even been allegations that some of Uber’s drivers have been gaming the system to produce fake demand spikes, bumping up the price and therefore their own income. Uber says, though, that they’d soon be detected and sacked. By the algorithms.

Uber is certainly the poster child for algorithmic business, for both good and bad, but it’s not the only one. There’s plenty of competition to be the most innovative, or perhaps the most extreme.

Some supermarkets are introducing digital pricing displays so that prices can be changed quickly and easily, as the BBC reported under a delightful headline, Why your bananas could soon cost more in the afternoon.

“Marks and Spencer conducted an electronic pricing trial where sandwiches were sold at a discount in the morning to encourage shoppers to buy their lunch early.

While the company isn’t currently planning to do this more widely, it comes as several of the large supermarkets are trialling the idea.

Sainsbury’s says it ran an electronic pricing trial two years ago, but wouldn’t say what conclusions it reached, while Morrisons and Tesco are each currently trying out the system in one of their stores.”

This kind of demand pricing might provide bargains and increase sales, but it might also lead to price-gouging, and disadvantage customers who might have no choice but to shop at high-demand times.

Centrelink, just following (the computer’s) orders

In mid-2015, Centrelink, the Australian government’s social security agency, started using algorithms to help recover overpayments, such as when someone continued to be paid Newstart benefits or Youth Allowance after they were no longer eligible.

The plan was to cross-match Centrelink’s data with data from the Australian Taxation Office (ATO) to identify discrepancies.

Centrelink sees these overpayments as a debt to be recovered. The system was therefore designed to calculate the amount owing, and begin the debt recovery process. The government claimed the system would recover $4 billion.

The program has been, and continues to be, a shambles.

The algorithms were so poorly conceived that vast numbers of incorrect debt notices were issued. The subsequent procedures for Centrelink clients to dispute the debt were onerous, with tight deadlines, and the amounts claimed were often impossibly high, causing great distress.

Details of the problems and their scale are still being winkled out of the Department of Human Services through freedom of information requests, but here’s just a small sample of the problems.

  • Centrelink tracks income fortnightly, but the ATO works to annual cycles. Income can vary through the year, so ATO data isn’t sufficient to calculate eligibility.
  • Once debt notices were sent, any challenge had to be lodged within 21 days, and it could only be done online.
  • Centrelink was chasing overpayments from up to seven years ago. So a challenge might require employment records going back that far, even though the ATO only requires records to be kept for five years. Centrelink itself had told clients that they only needed to keep payslips for six months.
  • Centrelink’s process for cross-checking the employer details provided in a challenge was severely flawed. Consultant Justin Warren, who has analysed that process, described it as mind-bendingly stupid.
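The first point in the list above, the mismatch between fortnightly and annual figures, can be made concrete with a small sketch. It assumes the annual ATO figure is simply spread evenly across 26 fortnights; the article does not describe Centrelink's exact calculation, so treat that as an assumption. The income threshold and the earnings pattern are hypothetical.

```python
FORTNIGHTS_PER_YEAR = 26


def flagged_using_actual(incomes, on_benefit, threshold):
    """Count benefit fortnights where the real fortnightly income exceeded the threshold."""
    return sum(1 for income, benefit in zip(incomes, on_benefit)
               if benefit and income > threshold)


def flagged_using_average(incomes, on_benefit, threshold):
    """Count benefit fortnights flagged when annual income is averaged evenly.

    Assumption for illustration: the annual total is divided by the number
    of fortnights, and that average is compared with the threshold.
    """
    average = sum(incomes) / len(incomes)
    return sum(1 for benefit in on_benefit if benefit and average > threshold)


if __name__ == "__main__":
    # Hypothetical person: no income for 13 fortnights while on benefits,
    # then $2,000 a fortnight for 13 fortnights after moving off benefits.
    incomes = [0.0] * 13 + [2000.0] * 13
    on_benefit = [True] * 13 + [False] * 13
    threshold = 300.0  # hypothetical income limit per fortnight

    print(flagged_using_actual(incomes, on_benefit, threshold))   # 0
    print(flagged_using_average(incomes, on_benefit, threshold))  # 13
```

Under these assumptions, someone who did nothing wrong is flagged for every fortnight they received a payment, purely because the averaged figure hides when the money was actually earned.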

The system failed to match the word “Proprietary” with “Pty”, for example. And Centrelink assumed that its own data had been entered correctly, meaning that a simple typing mistake one fortnight would ruin the data-matching process.

“Anyone who designs systems that deal with data should be aware of the issue of data entry errors. If you’re smart, you try to ensure the highest quality of data at the point of entry as you can, because every time you manipulate the data, you risk adding errors. If your data starts off messy, then it just gets messier later on,” Warren wrote.

“Which is exactly what is happening with Centrelink’s data matching.”
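The “Proprietary” versus “Pty” mismatch is the kind of problem a basic normalisation step handles before any comparison is made. The sketch below is one possible approach, not a description of Centrelink's system: the abbreviation table, the function names and the employer name “Acme” are all hypothetical.

```python
import re

# Hypothetical table mapping long forms to common abbreviations.
ABBREVIATIONS = {
    "proprietary": "pty",
    "limited": "ltd",
}


def normalise_employer(name: str) -> str:
    """Lower-case the name, drop punctuation, and abbreviate known words."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def same_employer(a: str, b: str) -> bool:
    """Compare two employer names after normalisation."""
    return normalise_employer(a) == normalise_employer(b)


if __name__ == "__main__":
    print(same_employer("Acme Proprietary Limited", "ACME Pty. Ltd"))  # True
    # A typing mistake still defeats exact matching after normalisation.
    print(same_employer("Acmee Pty Ltd", "Acme Pty Ltd"))  # False
```

As the second comparison shows, normalisation deals with abbreviations and punctuation but not with typing mistakes, which is why Warren's point about data quality at the point of entry still matters.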

Most of these problems can be put down to mere stupidity and incompetence. But two issues would seem to indicate an ethical vacuum.

  • Centrelink failed to plan for the volume of challenges coming in. According to the website #NotMyDebt, which has been tracking the debacle, “Centrelink will not allow you to speak to their in-office staff regarding the debt, and insist that you lodge your appeal online (the website’s reportedly frequently offline) or via the telephone (people report it is very hard to get through).”
  • While Centrelink has allocated resources and priority to chasing down overpayments, no attempt has been made to identify underpayments.

“Everyone agrees that the government has an obligation to identify any overpayments. But what has changed is that the Turnbull government has removed the human oversight and let loose a poorly designed computer algorithm, effectively outsourcing the verification to the recipient,” #NotMyDebt writes.

These are precisely the issues discussed by Nigel Phair elsewhere in this issue of DirectorTech.

The BS at the core of big data, and its risk

“Given the rise of big data as a socio-technical phenomenon, we argue that it is necessary to critically interrogate its assumptions and biases,” wrote danah boyd and Kate Crawford in their 2012 paper Critical questions for big data.

“Too often, big data enables the practice of apophenia: Seeing patterns where none actually exist, simply because enormous quantities of data can offer connections that radiate in all directions. In one notable example, Leinweber (2007) demonstrated that data mining techniques could show a strong but spurious correlation between the changes in the S&P 500 stock index and butter production.”

They derided what they called the core mythology of big data:

“The widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy”.
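The apophenia that boyd and Crawford describe is easy to reproduce. The sketch below does not use the S&P 500 or butter data from Leinweber's example. It uses purely random numbers, with hypothetical function names and parameters, to show that if you search enough unrelated series, one of them will appear to track your target closely.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_match(n_points=10, n_candidates=2000, seed=0):
    """Return the strongest correlation found between a random target
    series and many equally random, unrelated candidate series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    print(f"Strongest correlation with pure noise: {best_spurious_match():.2f}")
```

Every series here is noise, so any strong correlation the search turns up is meaningless by construction. The more candidates you test, the more impressive the best match looks.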

One clear example is the so-called “talent analytics” used in recruitment. The idea is that if you analyse the behaviours of potential employees, using whatever data you can get, it will tell you who’s the best for your needs.

The popular name for such statistical analysis right now is “machine learning”, and one of the problems with machine learning is that you can’t always know exactly how it’s reaching its conclusions.

Has the algorithm included data on race, sex, disability, age, or marital status, putting you in breach of anti-discrimination laws, for example?
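A first, very basic check is to look at what goes into the model at all. The sketch below assumes you can obtain the list of input field names for a model, which is not always the case. The attribute list simply mirrors the categories named above, and the function and field names are hypothetical.

```python
# Attribute names taken from the categories mentioned in the text;
# the exact spellings are an assumption for illustration.
PROTECTED_ATTRIBUTES = {"race", "sex", "disability", "age", "marital_status"}


def protected_features(feature_names):
    """Return the protected attributes that appear among a model's inputs."""
    normalised = {name.strip().lower().replace(" ", "_") for name in feature_names}
    return sorted(normalised & PROTECTED_ATTRIBUTES)


if __name__ == "__main__":
    # Hypothetical input fields for a recruitment model.
    fields = ["Years Experience", "Age", "postcode", "Marital Status"]
    print(protected_features(fields))  # ['age', 'marital_status']
```

An empty result would not prove the model is free of discrimination, since other fields can act as stand-ins for protected attributes. It is a starting point for the questions a board should be asking, not an answer to them.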

Big data ethics is one for the board

Clearly, there are risks involved when an organisation appoints computers as managers, operating according to algorithms and a big pool of data.

In nearly every corporate conference on big data I’ve been to, speakers have acknowledged the ethical issues too. But what are they doing about it? I regularly ask them a key question:

What organisational structures and processes, if any, do you have in place to ensure that those issues are on the table at every stage?

Usually the answer implies that the organisation may well have a corporate compliance committee which looks at regulatory issues such as privacy and consumer law.

But the answer rarely mentions any specific policy or process.

If your organisation is already rolling out big data and algorithmic systems, then the time has come to include ethical considerations in the mix.

Disclosure: Stilgherrian travelled to Ho Chi Minh City as a guest of the Asia-Pacific Network Information Centre (APNIC), whose conference was held in conjunction with APRICOT.
