The hidden insights in developers’ Google searches

 3 years ago
source link: https://bootcamp.uxdesign.cc/the-hidden-insights-in-developers-google-searches-47f05030cd2d

A selection of real search queries, and what they can teach us about designing for developers.

A screenshot of Google’s search page.
Photo by Christian Wiediger on Unsplash

Most of what developers do day to day involves lookups of one kind or another online: referencing documentation, choosing frameworks, investigating errors, checking keyboard shortcuts, and more.

Since online search is such an integral part of our work, it can act as a rich source of data about development practices, highlighting areas where good design can make a difference. A few years ago, I conducted research into exactly that — for two weeks, I collected the searches of 18 developers who volunteered for the study, asking them to explain some of their queries in detail every day.

The aim of the study was to look at search tooling specifically; it was only later that I realised the data might be interesting in a more general sense, too. Plus, a plain English summary of the full report is long overdue…

So, I’d like to offer here a curated selection of the search queries of those 18 programmers. I’ll cover what was sought (the different informational goals), and how (the strategies yielding the best results). I’ll point out the occasional design implication but, mostly, I want to illustrate the breadth of contexts and search styles, showing how search logs can act as an intimate window into the daily work of developers.

The data set

All participants signed an informed consent form and were fully aware of their search history being shared for the duration of the study. Their data was kept anonymous and used only for the purposes of the research.

In total, I collected 2488 programming searches from 18 developers over two weeks of tracking. Participants made an average of 15.9 searches a day (min=1, max=83). Of all searches, 347 were annotated, which means the developer who made the search explained the context in further detail. Those 347 annotations were what I based most of the analysis on.
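
As a quick sanity check on those figures, the arithmetic below works out the share of annotated searches and the number of searches per participant. The variable names are illustrative; only the three input numbers come from the study.

```python
# Figures quoted in the text above.
total_searches = 2488
participants = 18
annotated_searches = 347

# Share of searches that came with an explanation from the developer.
annotated_share = annotated_searches / total_searches

# Searches logged per participant across the whole two-week period.
searches_per_participant = total_searches / participants

print(f"annotated share: {annotated_share:.1%}")                    # annotated share: 13.9%
print(f"searches per participant: {searches_per_participant:.1f}")  # searches per participant: 138.2
```

In other words, roughly one search in seven was annotated, so most of the analysis rests on a sample of the full log.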

Search categories

I identified six search categories in the data. From most to least popular, they were:

  • Ad hoc how-tos. The developer is in the middle of a task, knows what they need to do next, but doesn’t know how.
  • Understanding an API. The developer wants to learn about the particulars of a language API or any third-party code they’re working with (libraries, frameworks, etc.), often concerning low-level details.
  • Recalling forgotten details. The developer needs to look up the specific syntax or naming of something.
  • Learning, research and investigation. The developer is exploring best practices, trade-offs and popular opinions, or learning about something new.
  • Troubleshooting. The developer is looking for the cause of an error or how to fix it.
  • Resources. The developer is looking for tools or libraries, or official documentation for those tools or libraries.

Ad hoc how-tos

With almost a third of all queries falling into this category, finding practical techniques for immediate use was the most common search motivation seen.

angular route uib tab
Was wondering how to combine UIB library’s tab directive with Angular UIRouter.

react setstate sub property
In React, state is immutable so you have to reset the state as a whole, but I wanted to mutate a sub key and set the state in one line.

When unaware of the exact APIs or techniques involved, the developer simply described the desired outcome:

bootstrap button next to input
I wanted to align a button next to an input field using Bootstrap.

Understanding an API

This category somewhat overlaps with how-to and recall searches, but what sets it apart is the interest in the official documentation of a particular language, library or framework.

Some searches in this category (like php preg_match) were generic; others were motivated by questions like “Is this supported?” or “How has it been implemented?” Examples include:

forcelayout api
Was searching for an ability to select parent node in D3 force-layout.

golang copy built in
I was trying to understand memory efficiency when [this function is] used with slices.

More so than in other categories, searches here were sometimes prompted by reading code authored by someone else. This illustrates the wonderful learning opportunities hidden in code reviews!

strlen
I saw this php function in code and didn’t know what it did.

java comparator interface
java default string comparator
I was reviewing some code written by a consultant and was wondering why they were not using methods/classes from the standard library.

Recalling forgotten details

This one is self-explanatory: the developer knows what she’s doing, but doesn’t remember the specific syntax or naming. With these searches, a quick code example is what people were often after.

ubuntu search packages
Searched after the command that’s used to search for programs on Linux Ubuntu.

URI uri = new URIBuilder
To make sure the syntax is correct.

java throw exception example
Looking for the correct syntax for method error throwing.

People often had a specific website in mind with these queries — they had likely made the search in the past and knew exactly where to look (this could also explain why a single search was often enough in this category). In this example, mdn refers to Mozilla Developer Network:

mdn transform origin
CSS transform syntax.

Recall searches appeared in nearly 20% of all cases, and the notable thing is that they occurred as often as they did. Many IDEs are syntax-aware and have built-in documentation; without those features, the proportion of recall searches might have been even greater. Or perhaps we’re at peak efficiency: a quick switch to the browser is so ingrained in muscle memory, and so reliably yields quick results for recall searches, that it’s fair to ask what the point of a specialised tool would be.

Learning, research and investigation

Queries here were about generic concepts (e.g., job queue), best practices (e.g., bad things about scala implicits) or trade-offs (e.g., svg vs png). They often occurred when planning a change or embarking on an unfamiliar task:

segmented circle css We’re implementing a new design that looks like a donut cut into 8 parts — wanted to see what was possible with CSS vs SVG.

best programming language for both ios and android
app dev design, what languages to use
code java app for ios and android
write helloworld ios app in java
Been tasked with designing an app, no idea where to start.

But there were also instances of casual curiosity:

boilerplate template
My colleague was talking about boilerplate templates and that I should check them out. So I Googled to find out what they were.

Troubleshooting

We all know the feeling — something is failing and you don’t know why; annoyed, you copy and paste the error message into Google and hope for the best.

show is not a member of org.apache.spark.sql.GroupedData
I didn’t know why I couldn’t show GroupedData. I wanted to find how to do it.

Possible infinite loop detected
This was an error message from the percona tools I was using to sync up the replica instance with the master.

Unlike in other searches, the queries here were consistently structured: they frequently either described the problem or used the error message verbatim. It is also here that queries were most often refined, with at least one instance of refinement in about half of the queries:

babel-jest
can’t console log in babel jest
logging in babel-jest
jest-cli
Thought I was in a different file than I actually was in so when running specs I couldn’t get console.log to work.

Resources

The resources people sought were most often third-party tools and libraries to integrate into their current project. For instance:

json minify
I wanted to find an online minifier tool to reduce whitespace in JSON. Standard JS minify wouldn’t work.

rails kineses gem
I’m looking at migrating our analytics pipeline to AWS and so I wanted to see if there were any gems to support this.

The rest were searches for reference tables such as character codes or keyboard shortcuts (e.g., unicode, with the annotation “Looking for a unicode table”), or things like dummy data generators. These searches stood out for being just as likely to occur during coding as during higher-level planning and research.

Search strategies

We’ve covered what people search for. Let’s now look at the how: the strategies employed when formulating the query and evaluating results.

Choice of search engine

Though most people in the study said they use several alternative search engines, only 5% of searches logged were actually made on something other than Google. DuckDuckGo was the search engine of choice for one participant, and only sometimes did people search directly on specialised sites like Mozilla Developer Network, GitHub or Hoogle.

Multiple services were sometimes used during a single search session, as seen here:

Google: sumo collector filters
Google: sumo collector filters include
Sumo Logic Community: filters include
GitHub SumoLogic repository: filters
I wanted to see if we’d used filters before in sumologic (we hadn’t). Turns out they don’t work and their docs should be completely removed from SumoLogic’s website.

For the ten GitHub code searches that were annotated (out of 92 total), motivations varied from investigating low-level behaviour to using source code as documentation (as seen with filters above).

When it comes to non-browser-based search tools, only one participant had their IDE equipped with web search features. Those who claimed to use offline documentation apps like Dash still made numerous documentation searches online, too.

Query refinement and keyword foraging

As already mentioned, queries were refined the least in recall searches, and the most in troubleshooting searches.

Query refinements weren’t necessarily tied to the type of search, though; they also occurred during keyword foraging, where the person searching doesn’t know what the concepts they need are called. (In the original report, I refer to this as “cross-domain translations”, but I’ve since been made aware of the much more suitable term.) Often, a familiar term is used first in the hopes that it will lead to better keywords:

jquery add to list
jquery add to array

mongoid update_attributes
Was looking for the equivalent of a Rails ActiveRecord method in the Mongoid persistence ORM.

There was one particular case of query refinement where the language of the search itself was changed:

step by step json lexer
comme costruire un lettore json
como crear lexer json

The participant noted: “Results in Spanish tend to have more extensive explanations. Results in Italian or French yield fewer results typically.”

The point at which a multi-lingual programmer abandons one language in favour of another can hint at how they evaluate resources. Queries like these also remind us of the importance of considering non-English content in any tool that aggregates resources.

Natural language vs code terms

Comparing the use of natural language to the use of code terms, recall searches included the most code terms across all categories. This was expected, as people looking for reminders usually have a good idea of the specific line of code they’re about to write.
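
To make the distinction concrete, here is a minimal sketch of how a query could be scored for code terms. It is a hypothetical heuristic, not the coding scheme used in the study: the function name `code_term_ratio` and the pattern are assumptions, and a token counts as a code term only if it contains an underscore, a lowercase letter followed by an uppercase one, or common code punctuation.

```python
import re

# Hypothetical heuristic (not the study's method): a token "looks like code"
# if it contains an underscore, a lower-to-upper case transition (camelCase),
# or one of a few punctuation characters that are common in source code.
CODE_TOKEN = re.compile(r"_|[a-z][A-Z]|[.()=\[\]{};]")


def code_term_ratio(query: str) -> float:
    """Return the fraction of whitespace-separated tokens that look like code."""
    tokens = query.split()
    if not tokens:
        return 0.0
    # Strip trailing sentence punctuation so "memory?" is not counted as code.
    code_like = [t for t in tokens if CODE_TOKEN.search(t.rstrip("?.,!"))]
    return len(code_like) / len(tokens)


# Queries taken from the examples in this article.
for query in [
    "mongoid update_attributes",
    "URI uri = new URIBuilder",
    "does md5sum read file into memory?",
]:
    print(f"{code_term_ratio(query):.2f}  {query}")
```

A pattern this crude misses plain identifiers such as strlen or URIBuilder, so it would undercount code terms in practice. Telling those apart from ordinary words needs knowledge of the language or library in question.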

Natural language queries (such as full-sentence questions) appeared at similar rates in all categories. For example:

does md5sum read file into memory? (Understanding a library or API)

what determines if a site is considered intranet for ie compatibility view settings (Troubleshooting)

javascript library that embed a string and if find an embed content render (Resources)

It seems to me that people sometimes fall back to natural language when the question is highly specific, or expressing it concisely is a struggle. These are the situations where you might want to just turn to the person sitting next to you and ask for help. Perhaps not all questions can be addressed by better IDE or search design.

Social validation

Finally, a prominent theme that emerged from the searches was the importance of social proof. Mentions of “the best” or “the accepted” way of doing things were frequent across categories:

react pass state to child
I wanted to pass the entire react state object to a child component, this is fairly simple to ‘just do’ but I was looking for the accepted and ‘safe’ way of doing so.

best process supervisor
I was . . . looking for people’s opinion on Linux process supervisors/monitors for managing web processes, workers, etc.

mysql using index
mysql indexes best practices

When integrating search into IDEs, one design direction has been to filter and re-format results to make them quickly scannable, prominently featuring code snippets and removing the messy, non-parsable human discussions around them (see Assieme or Mica). While useful for syntax lookups and the like, it would completely fail when the aim is to gauge best practices, as with the queries above.
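
To illustrate the trade-off, here is a minimal sketch of the snippet-only approach. It is not how Assieme or Mica work; the class `CodeSnippetExtractor`, the function `extract_snippets` and the sample answer are all made up for illustration. It keeps only the text inside `<pre>` elements of an HTML page and throws everything else away.

```python
from html.parser import HTMLParser


class CodeSnippetExtractor(HTMLParser):
    """Collects the text inside <pre> elements and ignores everything else."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # how many <pre> elements we are currently inside
        self._buffer = []    # text pieces of the snippet being collected
        self.snippets = []   # finished snippets, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.snippets.append("".join(self._buffer).strip())
                self._buffer = []

    def handle_data(self, data):
        # Text outside <pre> (the human discussion) is silently dropped.
        if self._depth > 0:
            self._buffer.append(data)


def extract_snippets(html: str) -> list:
    parser = CodeSnippetExtractor()
    parser.feed(html)
    parser.close()
    return parser.snippets


# Made-up answer text, loosely modelled on the "react pass state to child" query.
sample_answer = (
    "<p>You can spread the whole state into the child.</p>"
    "<pre><code>&lt;Child {...this.state} /&gt;</code></pre>"
    "<p>Most people advise passing only the props the child needs.</p>"
)

print(extract_snippets(sample_answer))
```

Run on the made-up answer, this keeps the snippet but drops both surrounding sentences, which is exactly where the advice about accepted practice lives.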

