An empirical study of obsolete answers on Stack Overflow [pdf]

awinter-py · on June 12, 2020

UX for fact checking, arbitration, correctness + staleness will be critical to the survival of information societies

SO not having a good way to retire these is the same issue as twitter's 'get the facts' banner

IMO wikipedia is the only organization that's solved it, and my sense from the outside (I've only written 1 article) is that it's with editor-gatekeepers, not with tools (though there are some bots)

swalsh · on June 12, 2020

Wikipedia demonstrates that an authoritarian governance model is more functional than essentially anarchy. But that does not mean we should settle for it. Wikipedia is known for its restrictive culture. A lot of good information does not make it past the guards. We can probably do better.

bawolff · on June 12, 2020

Wikipedia is not really an authoritarian model. You can't be appointed owner of an article or anything.

That's not to say there aren't rules, but rules are decided by collective decisions (or in practice the collective decisions of people from 15 years ago) and are implemented based on argument, debate and evidence. Usually when people say an "authoritarian" governance model, they mean there is a ruler or small class of elites, and whatever they say goes, no discussion, no debate, no appeal. That is very much not what wikipedia is like. It is however what a lot of social media sites are like (good luck appealing a fb moderator decision)

bondolo · on June 12, 2020

In my experience fighting the Wikipedia bureaucracy to right a wrong decision is like fighting bullshit, you expend 10x the energy you want just to engage in the fight. After being relatively prolific for several years on non-controversial reef fish topics I just gave up after multiple run-ins with the administrative hierarchy. The assumption of good faith was a sham and accountability was zilch. I had no interest in spending time fighting what were essentially capricious and arbitrary decisions by drive-by tin pots. It was hardly surprising when a couple years later one of the figures who had been an annoyance for me was finally booted from the organization for a pattern of similar behaviour across the site that went on for more than a decade. They had repeated warnings and essentially kept reverting to being a jerk. I wonder how many contributors were chased off by this one bad apple. If you want to make significant contributions to wikipedia but don't want to participate in the governance then you can expect occasional small-town crooked cop behaviour from the same kind of bullies who get off on that in the real world.

A typical case would be: you make edits to a stub species article to add basic information from fishbase. You don't add much because you have never actually heard of the species. The next day you come back and the article is reverted or proposed for deletion as non-notable. It doesn't matter that the stub had been around for a decade, or that there are 10k other stub fish articles. Nope, you tried to make a small contribution and that deserves to be punished for some reason. Eventually people get the message and stop contributing anything. Don't even get me started on trying to correct taxonomic names to the recognized name or fixing the list of species in a genus. These changes were regularly reverted without explanation. Fix 20 articles, 1 or 2 are reverted. Yep, just live with it.

bawolff · on June 13, 2020

To clarify, i'm not saying Wikipedia is a utopia, just that it's not authoritarian. I'd probably describe it as a bureaucratic-anarchist-do-ocracy

specialist · on June 12, 2020

"The assumption of good faith was a sham and accountability was zilch."

Is there any way to assess editors, how good a job they're doing?

bawolff · on June 13, 2020

There are ways to engage in dispute resolution. More court system than report card. Actually getting results can be difficult and involves lots of politics (or so people tell me, i haven't actually been involved in major disputes at wikipedia)

Assessing editors implies that someone is an authority above editors, to judge over them. There are of course moderators ("Administrators"), but the general idea is you are supposed to appeal to your peers and convince the group of your position, not convince someone in authority.

bondolo · on June 13, 2020

No formal process is described, but to be accountable, actions taken should be defensible or explainable, and generally they were not. There was also usually no evidence that the reviewer had any basis for making the assessment which led to their conclusion.

specialist · on June 13, 2020

Thank you u/bondolo and u/bawolff.

I'm fascinated by governance, transparency, accountability. For instance, I've been casually researching how to measure effectiveness of state legislators, how the misc rules impact the games they play.

Learning how wikipedia does things, for better or worse, is on my todo list. I'm hoping there are works like Peter Hintjens' Social Architecture for wikis. http://hintjens.com/books

bawolff · on June 13, 2020

A starting place to see how the sausage is made might be https://en.wikipedia.org/wiki/WP:SIGNPOST which is an internal newspaper type thing (although recently it's not quite as well regarded as it used to be). Usually it has summaries and links to the big disputes and discussions, especially anything going through arbcom (sort of like an internal court system). Of course big disputes tend to be a bit different than "small" disputes.

Wikipedia has all sorts of internal pages documenting these things, E.g. https://meta.wikimedia.org/wiki/Wikimedia_power_structure and https://en.wikipedia.org/wiki/Wikipedia:Polling_is_not_a_sub...

Also, many of the complaint/dispute pages are very public and you can read them e.g. https://en.wikipedia.org/wiki/Wikipedia:Administrators%27_no... https://en.wikipedia.org/wiki/Wikipedia:Administrators%27_no... https://en.wikipedia.org/wiki/Wikipedia:Arbitration/Requests

specialist · on June 15, 2020

Terrific. Thanks for the links, and the edit. :)

After skimming these, and poking around, I haven't found any metrics for participants. Like applying Moneyball notions to wikis. A very effective, popular example is GitHub's contribution visualization.

From the hip, I imagine sparklines for various metrics. The badges are nice, but depicting the timeline is important feedback. Like how many articles an editor helped shepherd to publication over time. So it'd be more obvious if editors are gatekeeping, how many of their decisions are appealed, how many decisions they lose, etc.
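To make the sparkline idea concrete, here is a rough Python sketch that renders a series of per-period counts (say, decisions appealed per month) as a one-line text sparkline. The function name and the choice of metric are illustrative assumptions, not anything Wikipedia or GitHub actually exposes.

```python
# Eight block characters, from lowest to highest bar.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a list of numbers as a one-line text sparkline."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series is drawn with the lowest bar throughout.
        return BARS[0] * len(values)
    # Scale each value into an index between 0 and len(BARS) - 1.
    return "".join(
        BARS[int((v - lo) / span * (len(BARS) - 1))] for v in values
    )


# Hypothetical monthly counts of one editor's decisions that were appealed.
appeals_per_month = [0, 1, 1, 3, 2, 5, 8, 4]
print(sparkline(appeals_per_month))
```

One line per metric next to an editor's name would show the trend over time, which a single badge or total cannot.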

Frankly, the wikipedia governance is overwhelming for a noob.

Unsatisfied with most depictions of how bills become laws, I created a graph to represent the tortured path each bill takes. (I really need to stay off HackerNews and work on my projects instead.) Imagine one of those American Gladiator course challenges as a flowchart.

Again, thank you.

belorn · on June 12, 2020

While it is known for being a restrictive culture, I suspect most of it is because the people making the decisions are visible. If we compare the difficulty of getting an article accepted on Wikipedia with submitting an article here on HN and getting people to discuss it and making it visible on the first page, the latter only has an invisible visibility score of upvotes and flags. If an article on HN gets killed or buried you don't know why or who the people were that made the collective decision. On wikipedia the people responsible for the decision are right there, and it feels much more personal getting rejected by a person rather than by an invisible score. And yet, I would definitely claim that getting something accepted on Wikipedia is much easier than getting something accepted by the HN community.

So I am not sure that the guards of wikipedia are that bad. A lot of information does not make it past the guards, but that is also something which communities desire. If you don't have moderators you have an upvote system. If you don't have an upvote system you have algorithmic models of suggestions and recommendations. If you don't have that you have a search function that orders by some kind of ranking.

PopeDotNinja · on June 12, 2020

Wikipedia is more restrictive than I’d like. It’s also pretty good considering how hard it is to have an audience of everybody. I don’t know how it could do better.

DaiPlusPlus · on June 12, 2020

I remember when Wikipedia first started cracking-down on “Trivia” sections. And yet sites like Everything2 still died.

What if Wikipedia allowed articles to have two versions: the normal, policed, articles we have today - and adding a “post anything, as long as you provide citations” version for sticking all the trivia and extra details that would normally not be allowed on a main Wikipedia page.

bawolff · on June 12, 2020

What would be the benefit of that over just having a different site dedicated to that.

I think there is plenty of room for all types of sites on the internet, but they don't have to all be hosted by the same group.

dtech · on June 12, 2020

The talk page functions as the latter for contested issues at least

nailer · on June 12, 2020

### Update for 2020:

I'm a correctness answerer on Stack Overflow. Top 0.14% on the site. There are established ways to overthrow correct but outdated answers (I've just used one). They have emerged organically through the community. This is mainly through necessity, especially in the JavaScript world where the standard library was missing, and now includes, nearly everything one would expect in a modern programming environment and much more.

However it would be nice for highly upvoted answers that replace existing marked correct answers that are no longer getting votes to be marked as the 'community accepted' answer.

shkkmo · on June 13, 2020

> There are established ways to overthrow correct but outdated answers (I've just used one).

What are they? I've tried to edit accepted answers that contain inaccuracies or outdated info and faced difficulty getting those edits approved. It is a time consuming and frustrating process in my experience.

techbio · on June 12, 2020

This is a “cache-invalidation/naming things” problem :)

wool_gather · on June 12, 2020

I doubt it's anywhere near that hard.

SO has crowdsourced "this is correct" (and "this is wrong") baked into its core in its voting system. One more button for "this is obsolete" would do a world of good. It wouldn't solve the whole thing -- there would be some subtlety that would be lost. But it would be at least as good as the "this is correct" signal that we already have.

Unfortunately the company has basically lost interest in its public site as a knowledge archive in the last 5-6 years and has stopped even considering problems like this, let alone working on them.

BrandoElFollito · on June 14, 2020

Could you expand the last part on SO having lost interest in their public site(s)?

gbear605 · on June 12, 2020

With Wikipedia, it’s a combination of editors that monitor articles and just the sheer number of people who look at Wikipedia every day. Even if 1% of those make a change on a given day, that’s a huge number of small modifications.

tomrod · on June 12, 2020

My colleagues and I call this "true-ing" -- crowdsourcing for new intel, establishing trust metrics with open doors for new participants, essentially giving a channel for bull* to be called on bad data.

sojournerc · on June 12, 2020

I also think about all the true things that have been published never to return because they were against the orthodoxy or accepted thinking at the time.

Easier to validate what's been carried forward, much harder to find what should've been but wasn't.

payne92 · on June 12, 2020

What are the issues with having a "mark obsolete" flag that users can check? (with an optional comment)

At a minimum, that would be an input to the presentation ranking -- old, flagged items would drift to the bottom.

Long-tail "flotsam and jetsam" content is a huge problem, generally, not just for software development information.

lucb1e · on June 12, 2020

Generally agree with the "mark obsolete" button, but I don't think the comment should be optional. If it's optional, you could mark anything as obsolete and you shift the burden of proof to the author (who may be long gone) or some community member to jump in, which sounds ripe for abuse to me. Rather, you should add why this is no longer current and let people with enough rep points verify it, similar to how editing someone else's post goes into an edit queue if you don't have enough reputation.

Might be relevant to mention that I'm quite active on the security stackexchange and regularly review the suggested edits queue (we don't have a constant backlog like stackoverflow does). Feel free to point out if you think this is not a nail for my hammer.
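As a rough illustration of the workflow described above (a mandatory reason, then verification by users with enough reputation), here is a minimal Python sketch. The class name, the reputation threshold and the number of approvals are all made-up assumptions; none of this reflects how Stack Exchange actually implements flags or review queues.

```python
from dataclasses import dataclass, field

REVIEW_THRESHOLD = 2000  # hypothetical reputation needed to review flags
APPROVALS_NEEDED = 2     # hypothetical number of approving reviewers


@dataclass
class ObsoleteFlag:
    answer_id: int
    flagger: str
    reason: str  # mandatory explanation of why the answer is outdated
    approvals: set = field(default_factory=set)

    def __post_init__(self):
        # Reject empty or whitespace-only reasons so the burden of proof
        # stays with the person raising the flag.
        if not self.reason.strip():
            raise ValueError("an obsolete flag must explain why")

    @property
    def confirmed(self):
        return len(self.approvals) >= APPROVALS_NEEDED

    def review(self, reviewer, reputation, approve):
        """Record one reviewer's verdict; return True once confirmed."""
        if reputation < REVIEW_THRESHOLD:
            raise PermissionError("not enough reputation to review flags")
        if reviewer == self.flagger:
            raise PermissionError("flaggers cannot review their own flag")
        if approve:
            self.approvals.add(reviewer)
        return self.confirmed


flag = ObsoleteFlag(42, "alice", "Relies on an API removed in version 3.0")
flag.review("bob", reputation=5000, approve=True)
print(flag.confirmed)  # still needs a second approval
```

Until `confirmed` is true the answer would be displayed as usual, so an unreviewed flag costs the author nothing.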

hinkley · on June 12, 2020

Sometimes people get stuck on an old version of a tool chain. For them, stackoverflow may be the last option they have because Google only ranks links for the new versions.

“Obsolete” isn’t a flag, it’s a version number, or even a range. This solution doesn’t work with 3.0. This one is deprecated in 3.5.

But since semver is neither universal nor infallible, you’d have to actually model languages and libraries, with a curated list of version numbers. Which is awkward when you built your entire categorization system on tagging with strings instead of modeling problem domains.
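A small Python sketch of "obsolete as a version range" might look like the following. It assumes dotted numeric version strings, which is exactly the assumption the paragraph above warns does not hold universally, and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '3.5' into (3, 5). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class AnswerApplicability:
    works_from: Version                       # first version the answer works with
    deprecated_in: Optional[Version] = None   # still works, but discouraged
    broken_in: Optional[Version] = None       # first version where it fails

    def status(self, version: str) -> str:
        v = parse_version(version)
        if v < self.works_from:
            return "not yet available"
        if self.broken_in is not None and v >= self.broken_in:
            return "obsolete"
        if self.deprecated_in is not None and v >= self.deprecated_in:
            return "deprecated"
        return "current"


# Hypothetical answer: works from 2.0, deprecated in 3.5, broken in 4.0.
answer = AnswerApplicability((2, 0), deprecated_in=(3, 5), broken_in=(4, 0))
for version in ("1.9", "3.0", "3.5", "4.1"):
    print(version, answer.status(version))
```

The same answer is "current" for someone stuck on 3.0 and "obsolete" for someone on 4.1, which a single boolean flag cannot express.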

jrumbut · on June 12, 2020

I think for obsolete, you should give a link to the updated version rather than a comment (like duplicate is today) or it could be a vote that is balanced among other signals rather than a cause for deletion or deep archiving (since obsolete systems maintainers need help too).

I tend to think the real problem is the overly strict conception of duplicate. Over time, the way people will ask a question and the way people will answer it changes.

5-10 years ago almost every JS question was a jQuery question too, now not so much. As someone who lived through that I can very easily translate to the less jQuery-centric present, but someone who started learning JS/React last week can't. A new rendition of such a question/answer would be a duplicate for me, but the old one would be obsolete to the new developer.

I think the best way forward is that both duplicate and obsolete should be soft signals rather than reasons for closing.

crispyambulance · on June 12, 2020

> I tend to think the real problem is the overly strict conception of duplicate.

It's true, people are so damn trigger-happy marking questions as duplicates.

I've seen new, well-posed questions on up-to-date frameworks get marked as dupe because 10 years ago someone asked a related question on an obsolete tech. The reason it was marked as dupe was simply because someone took it upon themselves to write up a sprawling smug "canonical" answer to a shitty old question that happened to cover the new subject matter.

It's much better to keep the old questions and answers, to just answer each question (and no more), and to create new questions as needed. Why not? it's not like they're running out of disk space.

I think the solution here is to encourage specific answers to specific questions, let folks sort out the historical minutiae based on timestamps and subject. Anything more elaborate is asking for mix-ups and confusion.

misnome · on June 12, 2020

> The reason it was marked as dupe was simply because someone took it upon themselves to write up a sprawling smug "canonical" answer to a shitty old question that happened to cover the new subject matter.

This seems an optimistic view. An awful lot of the time it feels like the new question just has some of the same outward appearance and is a completely different question, but the people marking it as a dupe don’t read it carefully enough, or don’t know enough about the subject to realise the differences.

jrumbut · on June 12, 2020

I will say, compared to the past, it's gotten better. They seem more likely to let a good effort possible duplicate go now.

Still, it is my quixotic StackOverflow crusade and there remains more to do.

lucb1e · on June 12, 2020

> I tend to think the real problem is the overly strict conception of duplicate.

I see where you're coming from, but having identical questions exist alongside each other just because their dates are different does not help those who follow a link to the older question. A new mechanism to indicate different versions would have to be added for this not to be confusing, if this is the solution we want to go with.

jrumbut · on June 12, 2020

The idea is that soft obsolete/duplicate is used to make one side of the link more visible and the other side less visible. It definitely requires the addition of a new mechanism, or at least an adaptation of existing mechanisms.

When you search how to solve the hypothetical JavaScript problem I mentioned, the classic highly upvoted jQuery version of the question is up top, but if you can't see how that addresses your issue you can keep digging (or keep asking) for a version that makes sense to you.

Spammy, low effort, almost character for character duplicates can still be removed. I certainly acknowledge that the line is not always easy to define, but I think giving the reader what they probably want fast and then letting them sort through the long tail if needed is the approach better suited to programmer Q&A where the ability/background range is huge and database searching skills are above average.

fabian2k · on June 12, 2020

That is probably one of the things that will be necessary to help with this problem, but this kind of feature is always only a part of the solution. And it's far more complex than just having a single flag.

Obsolete can mean a lot of different things, and there are degrees of obsolete. And people still use older versions of technology, so in some cases you might want to look for older solutions anyway. So it would likely have to be more like a version flag.

Now you need to get some people to curate that information and properly apply the version/obsolete tags. That's probably easy for some of the more often searched for posts, but very difficult for the long tail of answers. You need to educate the community on how this new feature works, and when to apply this flag. You need to decide on who can set the flag, whether you need multiple votes and how to handle disputes when people disagree or set it wrong.

If you decided that a version flag is needed, and not just an obsolete flag, you need a UI and people that manage the available versions for each programming language/framework/library.

You need to decide how to handle the same question in multiple versions. One question with multiple answers and each answer tagged with a version? A question per version? Do you actually want to enforce one variant, or allow both to exist? Questions can also be obsolete, and that is often in a more complicated way compared to answers. Should that be handled with this kind of flag as well?

This is not a trivial change, but something along these lines is probably necessary.

bryanrasmussen · on June 12, 2020

Although I hardly ever find anything useful on StackOverflow anymore, I think this is maybe partially overthinking the problem. For example, in the case of versioning: if I have a question but I find an answer that does not work and is several years out of date, then if I were clever I would ask something like

How do I solve problem X using version Y of Z.

In some cases people encounter questions about older versions of a technology and post a new answer saying it is an update for version X of the technology. I know this happens because I did it myself for a Gulp question and got a good number of upvotes, even though I could never actually get the accepted answer of course.

ARandomerDude · on June 12, 2020

Part of the issue of age is many of us work on "obsolete" systems. Being able to troubleshoot and get help on legacy codebases is as valuable for me as when it was new 15 years ago.

asdff · on June 12, 2020

At the very least stack overflow should require the version you are using. Python 2 and 3 is a mess on the website, as someone who isn't very great at python and can't easily spot the two versions. Even reading about inane stuff like tmux configs on the web is pretty crap since that syntax has changed several times as well. Version numbers should be required everywhere.

imadethis · on June 12, 2020

My guess is a greater percentage of users find SO questions from organic searches or third party links rather than within SO itself. An obsolete flag would need to be prominent to dissuade users from following answers anyways.

I think requiring a comment or a link to a more up-to-date answer would be nice, to avoid answers being marked as obsolete without any recourse.

lucb1e · on June 12, 2020

> An obsolete flag would need to be prominent

The way I (not the person you replied to) imagine this is that the post goes to the bottom and gets a red background color or is faded out, similar to how deleted answers are shown in red (if you have enough reputation; in this case everyone should be able to see these) and downvoted posts are faded out.

Edit: this is the background deleted posts currently get, for those who aren't active on the SE community: https://i.stack.imgur.com/EDAIF.png

not2b · on June 12, 2020

But movement of a post to the bottom doesn't help users who find it directly via a Google search. If it was a high-quality answer in 2017 and is badly wrong now, it may still be a highly ranked URL in Google's database. So the obsolete flag would need to be prominent for people who follow a direct link to the answer.

lucb1e · on June 12, 2020

Did you also see this part?

> and gets a red background color or is faded out

And most of the time, a newer answer is available (at least in my experience) so for those the downranking would also help.

abathur · on June 12, 2020

I think that would be a step in the right direction, but it may not help users arriving by search engine. I can imagine 3 things that might help:

1. Add an age-weighted score (and maybe sort by it) to make it easier to distinguish between an obsolete answer with hundreds of old upvotes and a recent answer with 10 recent upvotes.

2. Similar to above, but add an extra vote lever and visible score (perhaps conditionally, to older questions that stop getting upvoted?) for marking that an answer didn't work for you, or that you suspect it is obsolete, without having to downvote an answer that was given in good faith and worked for some time (and may still work for older versions/environments).

3. Add an option to open a superseding question (perhaps conditionally based on changes in absolute or age-weighted scores; perhaps it must be "community" owned but gets to inherit question upvotes) that has a special relationship with the original question and triggers extra UI on each side (to cross-link the questions, encourage users to directly mark which answers from the original work, and to provide feedback that affects when/whether the superseding question is treated as the canonical version).
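To illustrate option 1, here is a minimal Python sketch of an age-weighted score in which each upvote loses half its weight every `HALF_LIFE_DAYS`. The exponential decay and the one-year half-life are assumptions picked for the example; the comment above does not specify a formula.

```python
from datetime import datetime, timedelta

HALF_LIFE_DAYS = 365.0  # hypothetical: a vote counts half as much after a year


def age_weighted_score(vote_times, now, half_life_days=HALF_LIFE_DAYS):
    """Sum of upvotes, each discounted by 0.5 ** (age / half_life)."""
    score = 0.0
    for cast_at in vote_times:
        age_days = (now - cast_at).total_seconds() / 86400.0
        score += 0.5 ** (age_days / half_life_days)
    return score


now = datetime(2020, 6, 12)
# 300 upvotes cast roughly ten years ago versus 10 upvotes from last month.
old_answer = [now - timedelta(days=3650)] * 300
new_answer = [now - timedelta(days=30)] * 10

print(age_weighted_score(old_answer, now))
print(age_weighted_score(new_answer, now))
```

With these numbers the recent answer outranks the old one even though its raw vote count is thirty times lower. A shorter or longer half-life would shift that balance, and fast-moving tags would presumably want a shorter one than slow-moving tags.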

asdff · on June 12, 2020

A lot of this can be automatic, like flagging all python 2 syntax on the website as python 2. There can be confusion visiting a thread, looking at the year, and trying to figure out which version of python might have been used if you aren't very familiar with python 2 vs. 3, which would be most people coming to stack overflow I expect.
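A crude Python sketch of what such automatic tagging could look like: a few regular expressions for well-known Python 2 idioms, plus a check of whether the snippet parses under the Python 3 interpreter running the script. The marker list and the labels are illustrative assumptions, and a heuristic like this will certainly misclassify some snippets.

```python
import ast
import re

# Well-known Python 2 idioms. The list is deliberately short and incomplete.
PY2_MARKERS = [
    re.compile(r"^\s*print\s+[^(\s=]", re.MULTILINE),  # print "x" statement
    re.compile(r"\bexcept\s+\w+\s*,\s*\w+\s*:"),       # except Foo, e:
    re.compile(r"\braw_input\s*\("),                   # renamed to input()
    re.compile(r"\bxrange\s*\("),                      # renamed to range()
]


def guess_python_version(snippet):
    """Return a best-guess label for a code snippet from an answer."""
    if any(marker.search(snippet) for marker in PY2_MARKERS):
        return "python-2"
    try:
        ast.parse(snippet)  # parses with the running (Python 3) grammar
    except SyntaxError:
        return "unknown"
    return "python-3-compatible"


print(guess_python_version('print "hello"'))
print(guess_python_version('print("hello")'))
```

Anything labelled "unknown" would still need a human, which fits the point about volunteers in the reply below.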

thewebcount · on June 12, 2020

Oh man, I would love this. It's the same for Swift (maybe even worse, given the proliferation of versions over just a few years). But those sorts of things aren't generally automated on SO. They're done manually by volunteers. I did some for a while, and it's quite tiresome.

brudgers · on June 12, 2020

On StackOverflow, anyone can edit any question. Anyone can edit any answer. If someone sees a problem, such as obsolescence, they can fix it. Flagging "obsolete" doesn't fix anything. The person flagging is one of the someones in a "someone should fix this" flag.

nix0n · on June 12, 2020

> On StackOverflow, anyone can edit any question. Anyone can edit any answer.

Neither of these things is true. In fact, to a user with zero reputation, StackOverflow is much more locked down than Wikipedia.

But the real issue with obsolete questions, is that a new question will be marked as a duplicate of an old obsolete question and closed before anyone has a chance to answer it.

So the real value of an obsolete tag would be that a new question couldn't be marked as a duplicate of an obsolete one.
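The rule in that last sentence is simple enough to sketch in a few lines of Python. The `Question` class and the `close_as_duplicate` function are hypothetical and only show where such a check would sit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    id: int
    title: str
    obsolete: bool = False
    duplicate_of: Optional[int] = None


def close_as_duplicate(new_question, target):
    """Mark new_question as a duplicate unless the target is obsolete."""
    if target.obsolete:
        raise ValueError(
            f"question {target.id} is marked obsolete and cannot be a "
            "duplicate target"
        )
    new_question.duplicate_of = target.id


old = Question(1, "How do I do X?", obsolete=True)
new = Question(2, "How do I do X in the current version?")
try:
    close_as_duplicate(new, old)
except ValueError as err:
    print(err)
```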

brudgers · on June 12, 2020

Low reputation users' edits get reviewed. A zero reputation user is no more likely to have flagging privileges than other privileges. Optimizing for the zero reputation user in regard to obsolescence doesn't seem like a productive approach to me anyway.

jbay808 · on June 12, 2020

The person flagging is most likely to be someone who doesn't know the updated answer, which is probably what brought them to that SO page where they discovered the answer marked correct no longer works.

Someone · on June 12, 2020

Sounds reasonable, but I would expect wars over whether, for example, python2 is obsolete.

pelasaco · on June 12, 2020

that should be thought through.. for example, you couldn't/shouldn't mark a question as repeated, if it was marked as obsolete, even though the question per-se isn't obsolete, just maybe the "right" answer.

kevin_thibedeau · on June 12, 2020

There doesn't seem to be any remedy to the problem of poor, obsolete, or outright wrong answers being selected as the checked answer when the asker disappears. I frequently have to scroll past the approved one because there's often gold hiding below it.

MrZander · on June 12, 2020

I ran into this yesterday. I got an upvote on an answer I posted 7 years ago that was marked accepted. There was a much better answer with 3x the upvotes at the bottom of the list and the OP account was no longer active.

Wondering if there was a protocol for changing the accepted answer, I searched meta. I found the consensus is: 'Accepted' is at the sole discretion of the OP and shouldn't be taken to mean it's correct, just that it answered the OP's question. Which I think is BS as it elevates the answer to the top of the list and gives it credibility.

Why is the OP the czar that chooses the correct solution just because they asked the question first? Honestly, the OP is often the _least_ qualified person to validate an answer's correctness.

sgillen · on June 12, 2020

Sort of think there should be two systems, an OP-selected answer and a community-selected answer. This is already what happens to some extent, it’s just a matter of UX I think to clearly mark the answers the community thinks are good with something akin to the green check you get (and I guess more importantly move these to the top)

I’m not sure this matters too much for experienced devs, it’s really just a noobie trap to only look at a low upvote but chosen answer right?

MrZander · on June 12, 2020

I like the two system approach, that is a good idea.

Perhaps it is just a matter of properly ordering the answers, maybe giving community votes precedence over the accepted answer?

For example, here is the question I was talking about: https://stackoverflow.com/questions/11970586/apl-removing-el...

My answer should not be accepted, I was brand new to APL when I answered it. The "correct" answer is not accepted and is buried below two 0-vote answers.

The thing is, my answer _works_, just not well. I can easily see myself overlooking a better solution in a case like this. Maybe I'm just lazy though haha.
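A minimal Python sketch of the ordering suggested above, where community votes take precedence and acceptance only breaks ties, with a separate "community pick" alongside the OP's check mark. The vote counts below are made up to resemble the situation described and are not the real numbers from the linked question.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    author: str
    votes: int
    accepted: bool = False  # the OP's check mark


def order_answers(answers):
    """Sort by votes, highest first; acceptance only breaks ties."""
    return sorted(answers, key=lambda a: (a.votes, a.accepted), reverse=True)


def community_pick(answers):
    """The highest-voted answer, provided it has a positive score."""
    best = max(answers, key=lambda a: a.votes, default=None)
    if best is None or best.votes <= 0:
        return None
    return best


answers = [
    Answer("accepted_but_naive", votes=3, accepted=True),
    Answer("zero_vote_a", votes=0),
    Answer("zero_vote_b", votes=0),
    Answer("much_better", votes=9),
]
for answer in order_answers(answers):
    print(answer.author, answer.votes, "(accepted)" if answer.accepted else "")
```

The accepted answer keeps its check mark but no longer gets pinned above a clearly preferred alternative.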

DaiPlusPlus · on June 12, 2020

SO sometimes puts the highest-voted answer at the top when the accepted answer has fewer votes.

Buttons840 · on June 12, 2020

Tangentially: It makes sense that the asker of the question gets to choose the answer that worked for them, but by doing so the idea of "duplicate questions" is invalidated.

For example, I could very well ask a new question which ends up being flagged as a duplicate of another question whose accepted answer does not work for me.

Can two questions be duplicates if they have different answers? Do I only have to claim that the accepted answer does not work to show that the new question is unique?

tonyedgecombe · on June 12, 2020

The real problem is the asker isn’t knowledgeable enough to select the correct answer in many cases. I’m not sure of the value of this feature.

pelasaco · on June 12, 2020

that's something that was long discussed on Meta, especially to answer the question: "Is it fair to down-vote answers which were right in 2009, but aren't right anymore?" and the answer was "Yes, if you want to keep your points, you should maintain your answer as long as it exists" which is a kind of no-go for somebody like me with more than 14k points and more than 500 answers...

thomascgalvin · on June 12, 2020

> "Yes, if you want to keep your points, you should maintain your answer as long as it exists"

This is just one more example of the absolutely toxic culture within Stack Overflow itself. Everyone uses it, but we all get to it from Google.

Nobody I know bothers answering questions, and I don't think more than a half-dozen people I know have even submitted questions. If you do, the gatekeepers are going to jump down your throat.

NateEag · on June 12, 2020

I've answered a few questions casually on StackOverflow, when I ran into something that was unanswered and I knew what the answer was, but I've never invested seriously in it, as you can see from my profile:

https://stackoverflow.com/users/1128957/nateeag

I've also never had a gatekeeper jump down my throat.

I think you are overstating how bad SO's culture is.

I have definitely seen problems there, but they aren't the whole of the story.

(Though SO itself has entirely lost my goodwill due to the Monica Cellio incident: https://meta.stackexchange.com/questions/342039/firing-commu... )

root_axis · on June 12, 2020

How is it toxic? It seems pretty logical to me. If the answer is wrong in its current form then the answer's score should decay relative to others. It doesn't matter if individual users lose upvote points, a user's points are meaningless in aggregate, but on an individual answer they function as a signal for proximity to correctness. If the points are important to you then the burden should be on you to maintain the correctness, we should not enshrine wrong answers in perpetuity in order to appease individual upvote counts.

dgb23 · on June 12, 2020

Is the incentive there?

Both fact checking and editing old answers should lead to points; somehow.

Also maybe there is merit in adding structured data (and incentivizing it), which could be used to notify users/writers. For example adding version numbers could be a low hanging fruit.

JoeAltmaier · on June 12, 2020

Might work if down-voting were trackable. You'd know which answers quit making sense for readers.

lucb1e · on June 12, 2020

I see which posts of mine were downvoted, if that's what you mean. If someone downvotes I always revisit the post (also most of the time if someone upvotes actually, as I often find things to improve).

pelasaco · on June 12, 2020

well you can track your down-votes, but is it fair to down-vote something that was right in 2007, but not right in 2020 anymore? Maybe a mechanism to lock old right answers, and open the question again, I'm not sure. That's probably an issue that the Stackoverflow team didn't think about back in the day..

the_af · on June 12, 2020

Why is it unfair? I think SO should care more about the quality of questions and answers than about the reputation score of individual contributors.

This is a downside of gamification though: it becomes all about ego and earning points, which I don't think should be the goal of a site like SO. If keeping the quality of the content means some people must lose rep points, then so be it. If they have 14K rep they won't notice, anyway.

detaro · on June 12, 2020

A "this is outdated" flag that's independent of downvotes would achieve both goals.

the_af · on June 12, 2020

Maybe. It should be coupled with a mechanism to encourage more up to date answers, and if possible a link to a better answer if one exists. Also a way to help organic searches find more up to date answers.

The flag should act so that it isn't ripe for abuse (e.g. people flagging out of spite, without evidence).

Downvoting does have the side-effect of poking the original author into action, because gamification often means (regrettably) that people do care about rep points.

edit: an argument against down/upvoting: I think there is a sizeable "bandwagon" effect. People upvote what already has upvotes, and downvote what already has downvotes. I can't prove it but I'm pretty sure this happens. If so, obsolete answers will never go away by downvoting alone.

pelasaco · on June 12, 2020

it is unfair, because the answer isn't necessarily wrong. It's just outdated. If somebody asks "How to do X in rails", and your answer isn't valid in Rails 6 anymore it's not necessarily wrong if it still works with Rails 2.3.8.

lucb1e · on June 12, 2020

One existing mechanism to attract new answers is opening a bounty.

I'm not a big fan of the system, but just adding this to the discussion because I think it's the closest thing stackexchange has to an official way to re-open a question for answering. One of the reasons you can select for the bounty is something along the lines of "current answers are outdated".

bcrosby95 · on June 12, 2020

Kinda funny. You thought you were answering a question. What you were actually doing was volunteering to maintain a wiki page.

the_af · on June 12, 2020

You are not mandated though. If you don't want to maintain it, just accept the downvotes. If you have 14K rep, getting downvotes from a couple of obsolete answers is no big deal. When it starts to become a big deal, you can always do something about it -- or alternatively, you can opt to do nothing about it and accept the rep hit, again no big deal.

cutemonster · on June 13, 2020

That's from your perspective

> downvotes ... is no big deal

Some ppl feel sad and anxious about downvotes. The no big deal solution doesn't work with everyone.

pelasaco · on June 12, 2020

interesting point. I fully agree with that. Every answer is a wiki page. However, I strongly believe that if the question isn't specific about versions, an answer for an old version of the code (gem/pip/etc version X, instead of master) is still right.

For example, for the question "How to print to stdout with python", the answer "print 'foo'" is right, you just have to use python2. It may be stale or outdated, but it's not wrong.

boomboomsubban · on June 12, 2020

Would you care if suddenly 30 of your answers were now incorrect and you logged into 13k points?

pelasaco · on June 12, 2020

Unfortunately I care, if a question is generic like "How to test an attachment with Rails and Paperclip". My accepted solution from 2010 is still valid if you download the same versions of Rails and Paperclip used back in the day. I personally get angry when people down-vote it and answer it 10 years later saying that, in Rails 6, it's wrong.

root_axis · on June 12, 2020

Only if the software in question has not been updated much since the question was asked, or your answer specifically includes version numbers or is specific enough that it can be easily deduced that the answer only applies to a specific version. Otherwise, the answer is wrong, because it doesn't make sense to expect someone asking such a generic question to be running a decade old version of the stack and in almost every case they'd be worse off if they opted to use the decade old version of the stack just to make use of your answer.

boomboomsubban · on June 12, 2020

I can understand why that's annoying, but I asked how you would feel if your answer was incorrect. Do virtual points matter more than helping people find useful answers?

staycoolboy · on June 12, 2020

The call to action, section 5, is exceptional for an academic paper. I'm surprised the first suggestion doesn't already exist on S.O.

Honestly this is my biggest complaint with the web in general: immortal anti-information. Proposing analysis and strategies for combatting it on curated platforms is a great first step.

I think also implicit in this discussion is the role of the readers to vote with their mouses, so to speak. Without feedback from users, the mechanisms can't work effectively. Which is why I try to upvote as much as reasonable on HN and SO.

boomboomsubban · on June 12, 2020

>Without feedback from users, the mechanisms can't work effectively.

How positive are you that your knowledge is current before you vote? Users will confirm the information that they know, which is just as likely to be outdated.

staycoolboy · on June 12, 2020

Good question. All I got is ... law of averages? ;-)

Damn.

j4ah4n · on June 12, 2020

I kind of feel the same issue is being expressed in search engines as well. As time progresses, more relevant answers are moving down the list. Using Google for example, I'm finding that I have to employ the "Tools -> Time Range" filter to get better, more relevant results.

ape4 · on June 12, 2020

It needs to be versioned. Maybe you are working in an environment with an old C++ compiler that can't do the latest C++ tricks. You want best practices from the state of the art 10 years ago.

the_jeremy · on June 12, 2020

I have answered >100 questions on SO because I like answering questions. I would answer significantly fewer if I was required to give the range of versions that my answer worked on, because I only know that it works in my environment. If I could give just one version, then others would have to keep asking if my answer was best practice on their version. I suppose listing at least one version it works on is better than the current setup, though.

asdff · on June 12, 2020

The current setup is entirely obtuse. I suck at python. Sometimes I find an answer that solves my problem, and lo and behold, it's python 2 and I have to read up on the changes to that function between the two versions and see if I can even use the answer as written. If there was just some little tag in the op question (Python=2.7), that would save me hours. Multiply that anecdote by the mountains of novice traffic the site receives, and you can see how a lack of versioning makes a lot of answers worthless for a lot of people.

For your concern, if op said they needed help with version 2.7, presumably you'd write your answer with that in mind. Or you would say how you got it working in 3.8, and exactly how you set up your environment. If someone asked you about another version, you can throw your hands in the air and say "I don't know, but it works in 3.8 with these packages installed," and that would be a perfect response that shows others how to reproduce your work. Reproducibility should be standard practice, and you shouldn't be reliant on dubious context and dates and guesswork to reproduce an example in a website devoted to technical help.
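A rough sketch of how a "tested on" tag could be used, in Python. Everything here is hypothetical: the `Answer` class, the `tested_on` field and the `same_major` helper are made-up names, not anything Stack Overflow exposes, and matching on the major version only is a crude assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '2.7' or '3.8.1' into a comparable tuple such as (2, 7)."""
    return tuple(int(part) for part in text.strip().split("."))


@dataclass
class Answer:
    body: str
    tested_on: str  # hypothetical field: the one version the author actually ran it on


def same_major(answers: List[Answer], reader_version: str) -> List[Answer]:
    """Keep answers whose tested version shares the reader's major version."""
    major = parse_version(reader_version)[0]
    return [a for a in answers if parse_version(a.tested_on)[0] == major]
```

With one answer tagged 2.7 and another tagged 3.8, a reader on 3.6 would only be shown the second one. The author still only has to state the single version they tested on.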

ringshall · on June 12, 2020

It might be helpful to have the age of comments listed as part of their metadata, ie alongside the date the comment was posted. Some formatting could be added (eg red highlight for comments > x years).

I know this sort of feature is useful on newspaper websites - The Guardian will flag stories older than some limit as being potentially out-of-date.

lucb1e · on June 12, 2020

> It might be helpful to have the age of comments listed

That's... currently the case?

> red highlight for comments > x years

I don't find age has a 1:1 correlation with it being outdated. If some advice doesn't make sense to me, I'd look at the dates of this and other answers, because most often there will be newer answers (lower voted because they haven't existed as long / aren't seen as much) and/or comments added to the answer indicating how to do it in python3 or whatever the new thing is.

Sometimes posts from 2009 help me, sometimes posts from 2018 are outdated. Maybe this could work if a time limit is configured per tag, but even then, I expect it wouldn't be very helpful.
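For what it's worth, a per-tag limit could be sketched like this in Python. The tags, the limits and the function name are all invented for illustration; nothing here reflects real data or an existing feature.

```python
from datetime import date
from typing import Dict

# Hypothetical per-tag limits in years; the numbers are made up for illustration.
STALE_AFTER_YEARS: Dict[str, float] = {"javascript": 3.0, "c": 15.0}
DEFAULT_LIMIT_YEARS = 8.0  # assumption: fallback for tags without a configured limit


def is_possibly_stale(tag: str, answered_on: date, today: date) -> bool:
    """True when the answer is older than the limit configured for its tag."""
    age_years = (today - answered_on).days / 365.25
    return age_years > STALE_AFTER_YEARS.get(tag, DEFAULT_LIMIT_YEARS)
```

An answer from March 2009 would be highlighted under the "javascript" limit but not under the "c" limit, which is the whole point of making it per tag. It still only measures age, not whether the answer is actually outdated.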

ringshall · on June 12, 2020

>> It might be helpful to have the age of comments listed

> That's... currently the case

Is it currently the case? I'm not seeing it, though I may be missing it.

I do see, at the bottom of comments, something like

:: edited Apr 23 '15 at 8:40 / answered Mar 11 '09 at 21:11

The date the question was /asked/ does have an age, though, which may be what you're referring to. For the problem at hand, it's the age of the answers that matters more than the age of the questions.

> I don't find age has a 1:1 correlation with it being outdated.

No, but there is a correlation.

elliekelly · on June 12, 2020

I’m definitely in the minority here but when I’m learning a new language I kind of prefer to use a slightly outdated resource. When you get stuck and look for help you get the gist of the answer you’re looking for but not the solution and then you have to figure out the rest. It’s like a hint that gets you 50-90% of the way there.

When you’re in the middle of work and not actively trying to learn new material I’m sure the obsolete answers are frustrating but I don’t think the outdated information is entirely useless.

oblib · on June 12, 2020

I agree with the conclusion of the study (and payne92's suggestion here) but I haven't found obsolescence to be a huge issue because I generally take the time to review all the answers provided and the comments on them. Often times obsolete answers are noted in the comments.

But there is room for improvement.

andersco · on June 12, 2020

This seems to imply that answers are binary, either obsolete or not, while in my experience they are often only partially outdated. How is that treated in this study? Additionally, I find that for high traffic answers someone will often have posted an update to the obsolete response.

minimaxir · on June 12, 2020

Per the paper, they are using an SO archive dump from 2017, which is ages ago in internet time, although admittedly the problems with SO comments extend even before that.

It looks like the latest archive dump (March 2020) is available in BigQuery, e.g.: https://console.cloud.google.com/bigquery?p=fh-bigquery&d=st...

lucb1e · on June 12, 2020

While I like the irony of this paper using an outdated dataset, I don't think much changed on stackexchange since 2017 in this regard.

AndrewKemendo · on June 12, 2020

How could SO incentivize the community to clean that up?

Given that SO's content is basically 100% volunteer/user generated and free to access, it seems like the first step would be to allow users to flag obsolete answers with a very visible and obvious UI element.

Maybe second would be SO fielding a team of experts as "pruners" that would delete/update the flagged answers.

eitland · on June 12, 2020

Or maybe we could just let content that doesn't receive upvotes sink towards the bottom:

- a fresh upvote is 1 point

- one year old upvote is 0.5 point

- a two year old upvote is 0.25 points

If nobody votes, answers' relative positions stay the same.

Old answers aren't worthless but it also isn't impossible to lift updated content to the top later.

A person might vote for the same answer again after half a year, but it really just refreshes the vote to 1 instead of adding more value to it.
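The scheme above could be sketched roughly like this in Python. It assumes the halving continues smoothly between whole years (a one-year half-life), and that votes are stored as (voter, timestamp) pairs; both are assumptions, as are the function names.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

HALF_LIFE_DAYS = 365.0  # assumption: a vote loses half its weight every year


def vote_weight(cast_at: datetime, now: datetime) -> float:
    """Weight of one upvote: 1.0 when fresh, 0.5 after one year, 0.25 after two."""
    age_days = max((now - cast_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def answer_score(votes: List[Tuple[str, datetime]], now: datetime) -> float:
    """Sum the decayed weights, counting only each voter's most recent upvote."""
    latest: Dict[str, datetime] = {}
    for voter, cast_at in votes:
        # A repeat vote refreshes the earlier one instead of adding a second vote.
        if voter not in latest or cast_at > latest[voter]:
            latest[voter] = cast_at
    return sum(vote_weight(cast_at, now) for cast_at in latest.values())
```

Because every vote decays by the same factor over the same stretch of time, all scores shrink proportionally while nobody votes, so the ordering of answers doesn't change until a new vote comes in.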

I'm not a fan of deletionism and in Stack Overflow's case I'm fairly certain it has destroyed value for millions both in knowledge and reputation.

(Why? For years around 2009 - 2015 or something whenever you found a good answer that solved your problem it would most probably be flagged for removal or something.)

Avamander · on June 12, 2020

This would not be implemented because you suggested something that makes high-point users lose points.

eitland · on June 12, 2020

They will not do it.

They will also not allow duplicates or low effort questions, even if others are lining up to answer those low quality questions.

Whoever wants to improve the world as much as stackoverflow once did - and make a bunch of cash in the process - could try some of these ideas.

Someone will probably counter with the assumed fact that if there had been money in it someone would have done it already, to which I counter with the two economists who went down the street, saw some money on the ground and walked straight past it since "if the money was real someone would have taken it already".

wool_gather · on June 12, 2020

There was an upheaval on the Stack network over the last 10-12 months; part of the fallout is a lot of users pulling together on a non-profit take on QA: https://codidact.org

Still in the early stages, but it's promising.

eitland · on June 12, 2020

Thanks, looks promising.

The actual instances were somewhat hidden as far as I could see.

rozab · on June 12, 2020

Are you suggesting that SO actually pays people to create the content their site thrives on?

nogabebop23 · on June 12, 2020

In fairness to SO it was envisioned and originally created as a peer helping network and all the content is licensed in very liberal ways. The fact that it turned into 99% of the questions asked by noobs / answered by a very small group of experts may have been inevitable but I don't think it was planned. It's one of the few socialized Q&A-style sites that I feel as primarily a question "asker" I have benefited while the predominately question-answerers have as well.

Synergy!

QuasiGiani · on June 13, 2020

This of course(!) sounds as if like it'll like be quite good.

But. I am more interested in the inevitable corollarial follow-up:

An Thorough & Ignominious Probing Investigation Into The Problematic Presentation Of Pseudo-Intellectual Wankery On Hacker News

darepublic · on June 12, 2020

It would be nice to have answers marked as obsoleted and then a little historical breakdown of the greater changes that have come about to make that answer obsolete