Why Wolfram Tech Isn’t Open Source—A Dozen Reasons

 3 years ago
source link: https://www.tuicool.com/articles/hit/be6jA3A

Over the years, I have been asked many times about my opinions on free and open-source software. Sometimes the questions are driven by comparison to some promising or newly fashionable open-source project, sometimes by comparison to a stagnating open-source project and sometimes by the belief that Wolfram technology would be better if it were open source.

At the risk of provoking the fundamentalist end of the open-source community, I thought I would share some of my views in this blog. While there are counterexamples to most of what I have to say, not every point applies to every project, and I am somewhat glossing over the different kinds of “free” and “open,” I hope I have crystallized some key points.

Much of this blog could be summed up with two answers: (1) free, open-source software can be very good, but it isn’t good at doing what we are trying to do; with a large fraction of the reason being (2) open source distributes design over small, self-assembling groups who individually tackle parts of an overall task, but large-scale, unified design needs centralized control and sustained effort.

I came up with 12 reasons why I think that it would not have been possible to create the Wolfram technology stack using a free and open-source model. I would be interested to hear your views in the comments section below the blog.

  • A coherent vision requires centralized design
  • High-level languages need more design than low-level languages
  • You need multidisciplinary teams to unify disparate fields
  • Hard cases and boring stuff need to get done too
  • Crowd-sourced decisions can be bad for you
  • Our developers work for you, not just themselves
  • Unified computation requires unified design
  • Unified representation requires unified design
  • Open source doesn’t bring major tech innovation to market
  • Paid software offers an open quid pro quo
  • It takes steady income to sustain long-term R&D
  • Bad design is expensive

1. A coherent vision requires centralized design

FOSS (free and open-source software) development can work well when design problems can be distributed to independent teams who self-organize around separate aspects of a bigger challenge. If computation were just about building a big collection of algorithms, then this might be a successful approach.

But Wolfram’s vision for computation is much more profound—to unify and automate computation across computational fields, application areas, user types, interfaces and deployments. To achieve this requires centralized design of all aspects of technology—how computations fit together, as well as how they work. It requires knowing how computations can leverage other computations and perhaps most importantly, having a long-term vision for future capabilities that they will make possible in subsequent releases.

You can get a glimpse of how much is involved by sampling the 300+ hours of livestreamed Wolfram design review meetings.

Practical benefits of this include:

  • The very concept of unified computation has been largely led by Wolfram.
  • High backward and forward compatibility as computation extends to new domains.
  • Consistency across different kinds of computation (one syntax, consistent documentation, common data types that work across many functions, etc.).

2. High-level languages need more design than low-level languages

The core team for open-source language design is usually very small and therefore tends to focus on a minimal set of low-level language constructs to support the language’s key concepts. Higher-level concepts are then delegated to the competing developers of libraries, who design independently of each other or the core language team.

Wolfram’s vision of a computational language is the opposite of this approach. We believe in a language that focuses on delivering the full set of standardized high-level constructs that allows you to express ideas to the computer more quickly, with less code, in a literate, human-readable way. Only centralized design and centralized control can achieve this in a coherent and consistent way.

Practical benefits of this include:

  • One language to learn for all coding domains (computation, data science, interface building, system integration, reporting, process control, etc.)—enabling integrated workflows for which these are converging.
  • Code that is on average seven times shorter than Python, six times shorter than Java and three times shorter than R.
  • Code that is readable by both humans and machines.
  • Minimal dependencies (no collections of competing libraries from different sources with independent and shifting compatibility).

3. You need multidisciplinary teams to unify disparate fields

Self-assembling development teams tend to rally around a single topic and so tend to come from the same community. As a result, one sees many open-source tools tackle only a single computational domain. You see statistics packages, machine learning libraries, image processing libraries—and the only open-source attempts to unify domains are limited to pulling together collections of these single-domain libraries and adding a veneer of connectivity. Unifying different fields takes more than this.

Because Wolfram is large and diverse enough to bring together people from many different fields, it can take on the centralized design challenge of finding the common tasks, workflows and computations of those different fields. Centralized decision making can target new domains and professionally recruit the necessary domain experts, rather than relying on them to identify the opportunity for themselves and volunteer their time to a project that has not yet touched their field.

Practical benefits of this include:

  • Provides a common language across domains including statistics, optimization, graph theory, machine learning, time series, geometry, modeling and many more.
  • Provides a common language for engineers, data scientists, physicists, financial engineers and many more.
  • Tasks that cross different data and computational domains are no harder than domain-specific tasks.
  • Engaged with emergent fields such as blockchain.

4. Hard cases and boring stuff need to get done too

Much of the perceived success of open-source development comes from its access to “volunteer developers.” But volunteers tend to be drawn to the fun parts of projects—building new features that they personally want or that they perceive others need. While this often starts off well and can quickly generate proof-of-concept tools, good software has a long tail of less glamorous work that also needs to be done. This includes testing, debugging, writing documentation (both developer and user), relentlessly refining user interfaces and workflows, porting to a multiplicity of platforms and optimizing across them. Even when the work is done, there is a long-term liability in fixing and optimizing code that breaks as dependencies such as the operating system change over many years.

While it would not be impossible for a FOSS project to do these things well, the commercially funded approach of having paid employees directed to deliver good end-user experience does, over the long term, a consistently better job on this “final mile” of usability than relying on goodwill.

Some practical benefits of this include:

  • Tens of thousands of pages of consistently and highly organized documentation with over 100,000 examples.
  • The most unified notebook interface in the world, unifying exploration, code development, presentation and deployment workflows in a consistent way.
  • Write-once deployment over many platforms both locally and in the cloud.

5. Crowd-sourced decisions can be bad for you

While bad leadership is always bad, good leadership is typically better than compromises made in committees.

Your choice of computational tool is a serious investment. You will spend a lot of time learning the tool, and much of your future work will be built on top of it, as well as having to pay any license fees. In practice, it is likely to be a long-term decision, so it is important that you have confidence in the technology’s future.

Because open-source projects are directed by their contributors, there is a risk of hijacking by interest groups whose view of the future is not aligned with yours. The theoretical safety net of access to source code can compound the problem by producing multiple forks of projects, so that it becomes harder to share your work as communities are divided between competing versions.

While the commercial model does not guarantee protection from this issue, it does guarantee a single authoritative version of technology and it does motivate management to be led by decisions that benefit the majority of its users over the needs of specialist interests.

In practice, if you look at Wolfram Research’s history, you will see:

  • Ongoing development effort across all aspects of the Wolfram technology stack.
  • Consistency of design and compatibility of code and documents over 30 years.
  • Consistency of prices and commercial policy over 30 years.

6. Our developers work for you, not just themselves

Many open-source tools are available as a side effect of their developers’ needs or interests. Tools are often created to solve a developer’s problem and are then made available to others, or researchers apply for grants to explore their own area of research and code is made available as part of academic publication. Figuring out how other people want to use tools and creating workflows that are broadly useful is one of those long-tail development problems that open source typically leaves to the user to solve.

Commercial funding models reverse this motivation. Unless we consider the widest range of workflows, spend time supporting them and ensure that algorithms solve the widest range of inputs, not just the original motivating ones, people like you will not pay for the software. Only by listening to both the developers’ expert input and the commercial teams’ understanding of their customers’ needs and feedback is it possible to design and implement tools that are useful to the widest range of users and create a product that is most likely to sell well. We don’t always get it right, but we are always trying to make the tool that we think will benefit the most people, and is therefore the most likely to help you.

Practical benefits include:

7. Unified computation requires unified design

Complete integration of computation over a broad set of algorithms creates significantly more design than simply implementing a collection of independent algorithms.

Design coherence is important for enabling different computations to work together without making the end user responsible for converting data types, mapping functional interfaces or rethinking concepts by having to write potentially complex bridging code. Only design that transcends a specific computational field and the details of computational mechanics makes accessible the power of the computations for new applications.

The typical unmanaged, single-domain, open-source contributors will not easily bring this kind of unification, however knowledgeable they are within their domain.

Practical benefits of this include:

  • Avoids costs of switching between systems and specifications (having to write excessive glue code to join different libraries with different designs).
  • Immediate access to unanticipated functions without stopping to hunt for libraries.
  • Wolfram developers can get the same benefits of unification as they create more sophisticated implementations of new functionality by building on existing capabilities.
  • The Wolfram Language’s task-oriented design allows your code to benefit from new algorithms without having to rewrite it.

8. Unified representation requires unified design

Computation isn’t the only thing that Wolfram is trying to unify. To create productive tools, it is necessary to unify the representation of disparate elements involved in a computational workflow: many types of rich data, documents, interactivity, visualizations, programs, deployments and more. A truly unified computational representation enables abstraction above each of these individual elements, enabling new levels of conceptualization of solutions as well as implementing more traditional approaches.

The open-source model of bringing separately conceived, independently implemented projects together is the antithesis of this approach—either because developers design representations around a specific application that are not rich enough to be applied in other applications, or if they are widely applicable, they only tackle a narrow slice of the workflow.

Often the consequence is that data interchange is done in the lowest common format, such as numerical or textual arrays—often the native types of the underlying language. Associated knowledge is discarded; for example, that the data represents a graph, or that the values are in specific units, or that text labels represent geographic locations, etc. The management of that discarded knowledge, the coercion between types and the preparation for computation must be repeatedly managed by the user each time they apply a different kind of computation or bring a new open-source tool into their toolset.
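The knowledge loss described above can be illustrated with a minimal Python sketch. The `Quantity` class and the `TO_METRES` table are hypothetical, invented for this illustration, and are not taken from any particular library or from the Wolfram Language itself.

```python
from dataclasses import dataclass

# Hypothetical conversion table for this sketch: factor from each unit to metres.
TO_METRES = {"m": 1.0, "km": 1000.0, "ft": 0.3048}


@dataclass(frozen=True)
class Quantity:
    """A value that carries its unit with it (illustrative only)."""
    value: float
    unit: str

    def to(self, unit: str) -> "Quantity":
        # Convert via metres using the table above.
        metres = self.value * TO_METRES[self.unit]
        return Quantity(metres / TO_METRES[unit], unit)

    def __add__(self, other: "Quantity") -> "Quantity":
        # Coerce the right-hand side to this object's unit before adding.
        return Quantity(self.value + other.to(self.unit).value, self.unit)


# Lowest-common-format interchange: bare floats, so the units are dropped
# and kilometres are silently added to metres.
bare_total = 1.5 + 200.0

# A representation that keeps the knowledge attached to the data, so the
# coercion happens once, inside the type, rather than in user code.
rich_total = Quantity(1.5, "km") + Quantity(200.0, "m")

print(bare_total)
print(rich_total)
```

With bare numbers, the user has to remember and re-apply the unit conversion at every boundary between tools. With a richer representation, that bookkeeping travels with the data.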

Practical examples of this include:

  • The Wolfram Language can use the same operations to create or transform many types of data, documents, interfaces and even itself.
  • Wolfram machine learning tools automatically accept text, sounds, images and numeric and categorical data.
  • As well as doing geometry calculations, the geometric representations in the Wolfram Language can be used to constrain optimizations, define regions of integration, control the envelope of visualizations, set the boundary values for PDE solvers, create Unity game objects and generate 3D prints.

9. Open source doesn’t bring major tech innovation to market

FOSS development tends to react to immediate user needs—specific functionality, existing workflows or emulation of existing closed-source software. Major innovations require anticipating needs that users do not know they have and addressing them with solutions that are not constrained by an individual’s experience.

As well as having a vision beyond incremental improvements and narrowly focused goals, innovation requires persistence to repeatedly invent, refine and fail until successful new ideas emerge and are developed to mass usefulness. Open source does not generally support this persistence over enough different contributors to achieve big, market-ready innovation. This is why most large open-source projects are commercial projects, started as commercial projects or follow and sometimes replicate successful commercial projects.

While the commercial model certainly does not guarantee innovation, steady revenue streams are required to fund the long-term effort needed to bring innovation all the way to product worthiness. Wolfram has produced key innovations over 30 years, not least having led the concept of computation as a single unified field.

Open source often does create ecosystems that encourage many small-scale innovations, but while bolder innovations do widely exist at the early experimental stages, they often fail to be refined to the point of usefulness in large-scale adoption. And open-source projects have been very innovative at finding new business models to replace the traditional, paid-product model.

Other examples of Wolfram innovation include:

  • Wolfram invented the computational notebook, which has been partially mirrored by Jupyter and others.
  • Wolfram invented the concept of automated creation of interactive components in notebooks with its Manipulate function (also now emulated by others).
  • Wolfram develops automatic algorithm selection for all task-oriented superfunctions (Predict, Classify, NDSolve, Integrate, NMinimize, etc.).
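To make the idea of automatic algorithm selection concrete, here is a small Python sketch of a task-oriented function that chooses its own method. It only illustrates the general pattern under simple assumptions. It is not how Wolfram implements its superfunctions, and the names `find_root`, `newton` and `bisect` are hypothetical helpers written for this example.

```python
def bisect(f, lo, hi, tol=1e-12):
    """Robust root finder; assumes f changes sign on [lo, hi]."""
    flo = f(lo)
    if flo == 0:
        return lo
    for _ in range(200):
        mid = (lo + hi) / 2
        fmid = f(mid)
        if fmid == 0 or (hi - lo) / 2 < tol:
            return mid
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid   # root lies in the upper half
        else:
            hi = mid              # root lies in the lower half
    return (lo + hi) / 2


def newton(f, df, x0, tol=1e-12):
    """Fast root finder; needs a derivative and a reasonable starting point."""
    x = x0
    for _ in range(50):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise ArithmeticError("Newton iteration did not converge")


def find_root(f, lo, hi, df=None):
    """Task-oriented entry point: the caller states the task, not the method."""
    if df is not None:
        try:
            x = newton(f, df, (lo + hi) / 2)
            if lo <= x <= hi:
                return x
        except ArithmeticError:  # also covers ZeroDivisionError
            pass                 # fall back to the robust method
    return bisect(f, lo, hi)


def f(x):
    return x * x - 2


print(find_root(f, 0.0, 2.0))                       # bisection is chosen
print(find_root(f, 0.0, 2.0, df=lambda x: 2 * x))   # Newton is chosen
```

Because callers only ever invoke `find_root`, a better method can later be added behind the same entry point without any change to their code.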

10. Paid software offers an open quid pro quo

Free software isn’t without cost. It may not cost you cash upfront, but there are other ways it either monetizes you or that it may cost you more later. The alternative business models that accompany open source and the deferred and hidden costs may be suitable for you, but it is important to understand them and their effects. If you don’t think about the costs or believe there is no cost, you will likely be caught out later.

While you may not ideally want to pay in cash, I believe that for computation software, it is the most transparent quid pro quo.

“Open source” is often simply a business model that broadly falls into four groups:

Freemium: The freemium model of free core technology with additional paid features (extra libraries and toolboxes, CPU time, deployment, etc.) often relies on your failure to predict your longer-term needs. Because of the investment of your time in the free component, you are “in too deep” when you need to start paying. The problem with this model is that it creates a motivation for the developer toward designs that appear useful but withhold important components, particularly features that matter in later development or in production, such as security features.

Commercial traps: The commercial trap sets out to make you believe that you are getting something for free when you are not. In a sense, the Freemium model sometimes does this by not being upfront about the parts that you will end up needing and having to pay for. But there are other, more direct traps, such as free software that uses patented technology. You get the software for free, but once you are using it they come after you for patent fees. Another common trap is free software that becomes non-free, such as recent moves with Java, or that starts including non-free components that gradually drive a wedge of non-free dependency until the supplier can demand what they want from you.

User exploitation: Various forms of business models center on extracting value from you and your interactions. The most common are serving you ads, harvesting data from you or giving you biased recommendations. The model creates a motivation to design workflows to maximize the hidden benefit, such as ways to get you to see more ads, to reveal more of your data or to sell influence over you. While not necessarily harmful, it is worth trying to understand how you are providing hidden value and whether you find that acceptable.

Free by side effect: Software is created by someone for their own needs, which they have no interest in commercializing or protecting. While this is genuinely free software, the principal motivation of the developer is their own needs, not yours. If your needs are not aligned, this may produce problems in support or development directions. Software developed by research grants has a similar problem. Grants drive developers to prioritize impressing funding bodies who provide grants more than impressing the end users of the software. With most research grants being for fixed periods, they also drive a focus on initial delivery rather than long-term support. In the long run, misaligned interests cost you in the time and effort it takes you to adapt the tool to your needs or to work around its developers’ decisions. Of course, if your software is funded by grants or by the work of publicly funded academics and employees, then you are also paying through your taxes, but I guess there is no avoiding that!

In contrast, the long-term commercial model that Wolfram chooses motivates maximizing the usefulness of the development to the end users, who are directly providing the funding, to ensure that they continue to choose to fund development through upgrades or maintenance. The model is very direct and upfront. We try to persuade you to buy the software by making what we think you want, and you pay to use it. The users who make more use of it generally are the ones who pay more. No one likes paying money, but it is clear what the deal is and it aligns our interest with yours.

Now, it is clearly true that many commercial companies producing paid software have behaved very badly and have been the very source of the “vendor lock-in” fear that makes open source appealing. Sometimes that stems from misalignment of management’s short-term interest to their company’s long-term interests, sometimes just because they think it is a good idea. All I can do is point to Wolfram history, and in 30 years we have kept prices and licensing models remarkably stable (though every year you get more for your money) and have always avoided undocumented, encrypted and non-exportable data and document formats and other nasty lock-in tricks. We have always tried to be indispensable rather than “locked in.”

In all cases, code is free only when the author doesn’t care, because they are making their money somewhere else. Whatever the commercial and strategic model is, it is important that the interests of those you rely on are aligned with yours.

Some benefits of our choice of model have included:

  • An all-in-one technology stack that has everything you need for a given task.
  • No hidden data gathering and sale or external advertising.
  • Long-term development and support.

11. It takes steady income to sustain long-term R&D

Before investing work into a platform, it is important to know that one is backing the right technology not just for today but into the future. You want your platform to incrementally improve and to keep up with changes in operating systems, hardware and other technologies. This takes sustained and steady effort and that requires sustained and steady funding.

Many open-source projects with their casual contributors and sporadic grant funding cannot predict their capacity for future investment and so tend to focus on short-term projects. Such short bursts of activity are not sufficient to bring large, complex or innovative projects to release quality.

While early enthusiasm for an open-source project often provides sufficient initial effort, sustaining the increased maintenance demand of a growing code base becomes increasingly problematic. As projects grow in size, the effort required to join a project increases. It is important to be able to motivate developers through the low-productivity early months, which, frankly, are not much fun. Salaries are a good motivation. When producing good output is no longer personally rewarding, open-source projects that rely on volunteers tend to stall.

A successful commercial model can provide the sustained and steady funding needed to make sure that the right platform today is still the right platform tomorrow.

You can see the practical benefit of steady, customer-funded investment in Wolfram technology:

  • Regular feature or maintenance upgrades for over 30 years.
  • Cross-platform support maintained throughout our history.
  • Multi-year development projects as well as short-term projects.

12. Bad design is expensive

Much has been written about how total cost of ownership of major commercial software is often lower than free open-source software, when you take into account productivity, support costs, training costs, etc. While I don’t have the space here to argue that out in full, I will point out that nowhere are those arguments more true than in unified computation. Poor design and poor integration in computation result in an explosion of complexity, which brings with it a heavy price for usability, productivity and sustainability.

Every time a computation chokes on input that is an unacceptable type or out of acceptable range or presented in the wrong conceptualization, that is a problem for you to solve; every time functionality is confusing to use because the design was a bit muddled and the documentation was poor, you spend more of your valuable life staring at the screen. Generally speaking, the users of technical software are more expensive people who are trying to produce more valuable outputs, so wasted time in computation comes at a particularly high cost.

It’s incredibly tough to keep the Wolfram Language easy to use and have functions “just work” as its capabilities continue to grow so rapidly. But Wolfram’s focus on global design (see it in action) together with high effort on the final polish of good documentation and good user interface support has made it easier and more productive than many much smaller systems.

Summary: Not being open source makes the Wolfram Language possible

As I said at the start, the open-source model can work very well in smaller, self-contained subsets of computation where small teams can focus on local design issues. Indeed, the Wolfram technology stack makes use of and contributes to a number of excellent open-source libraries for specialized tasks, such as MXNet (neural network training), GMP (high-precision numeric computation), LAPACK (numeric linear algebra) and for many of the 185 import/export formats automated behind the Wolfram Language commands Import and Export. Where it makes sense, we make self-contained projects open source, such as the Wolfram Demonstrations Project, the new Wolfram Function Repository and components such as the Client Library for Python.

But our vision is a grand one—unify all of computation into a single coherent language, and for that, the FOSS development model is not well suited.

The central question is, How do you organize such a huge project and how do you fund it so that you can sustain the effort required to design and implement it coherently? Licenses and prices are details that follow from that. By creating a company that can coordinate the teams tightly and by generating a steady income by selling tools that customers want and are happy to pay for, we have been able to make significant progress on this challenge, in a way that is designed ready for the next round of development. I don’t believe it would have been possible using an open-source approach and I don’t believe that the future we have planned will be either.

This does not rule out exposing more of our code for inspection. However, right now, large amounts of code are visible (though not conveniently or in a publicized way) but few people seem to care. It is hard to know if the resources needed to improve code access, for the few who would make use of it, are worth the cost to everyone else.

Perhaps I have missed some reasons, or you disagree with some of my rationale. Leave a comment below and I will try and respond.
