June 09 2017

free software - Bits of Freedom: My Debian Application (anno 1998)

Digging around in some files, in a dark corner of my hard drive, I found my application to become a Debian Developer, sent on the 18th of April 1998:

Hi,

this is a request for a registration as an official Debian developer. I'm currently working with the Sparc port of Debian and expect to make sure that all Debian packages compiles and is released for Sparc. Attached below is my PGP key. I have read the Debian Social Contract and have also retrieved and looked at the Debian Policy Manual and the Debian Packaging Manual. I'd like to be subscribed to debian-private as jonas@coyote.org, which is my primary mail address. My preferred login name on master is 'jonas', but 'oberg' will work aswell if the preferred login name is already taken. Also attached below is a /signed/ version of my PGP Public key by debian developer Björn Brenander.

Thank you,

Jonas

free software - Bits of Freedom: Is this the end of decentralisation (Revisited)?

In May last year, I wrote an article in which I argued that the need for interoperability would, over time, trump the need for centralisation. The background to the article was Signal, a secure messaging app built on a centralised platform, and the criticism it had to endure for its decision to stay centralised.

My bet, one year ago, was that by this time the world would have moved a little bit more towards a federated structure. Are we any closer to it? I would be hesitant to say so. The messaging protocols in use continue to be dominated by WhatsApp, Telegram, Facebook and Signal, none of which is showing signs of moving towards a federated structure yet.

What might be some indication of convergence is that WhatsApp, Facebook and Google Allo all use the Signal Protocol, even if in a non-federated way on their own servers. So out of the four most commonly used messaging platforms, three are using the same protocol, with only Telegram being the outlier.

That doesn't mean federation, but I'll give myself a yellow light for my prediction a year ago. We haven't moved as far towards federation as I would have hoped, but there has been some convergence in the market which may give some indication of what is to come.

Let's set a new reminder for June 2018!

June 07 2017

Henri Bergius: Reprogramming the hackerspace: MsgFlo IoT workshops at c-base and Bitraf

This July we’re organizing two hack weekends around MsgFlo and Flowhub:

Both of these will focus on reprogramming the Internet of Things setup of the hackerspace. The aim is to connect more devices at the spaces to the MsgFlo environment, come up with new connections between systems, and find new ways to visualize them.

If you’re interested, feel welcome to join! Bring your own laptop. For the Bitraf event, please register on the Meetup page.

c-base disco mode

Focus areas

  • MsgFlo IoT and new functionality — connecting internet services with MsgFlo, adding new smarts to the hackerspace IoT setup. Skills: Python, Node.js, Rust
  • Hardware hacking — connecting more devices with MsgFlo. Skills: microcontroller programming, electronics
  • Information displays — new infoscreen designs, data visualizations. Skills: web design, React.js, Django
  • Mobile app — bringing the hackerspace IoT functionality to mobile. Skills: Android
  • Woodworking — new cases, mounts, decorations for various systems. Skills: woodworking, painting

You don’t have to be an expert to participate. We’ll be there to help you get up to speed!

More information

c-flo station at c-base

June 06 2017

Paul Boddie's Free Software-related blog » English: Some Thoughts on Python-Like Languages

A few different things have happened recently that got me thinking about writing something about Python, its future, and Python-like languages. I don’t follow the different Python implementations as closely as I used to, but certain things did catch my attention over the last few months. But let us start with things closer to the present day.

I was neither at the North American PyCon event, nor at the invitation-only Python Language Summit that occurred as part of that gathering, but LWN.net has been reporting the proceedings to its subscribers. One of the presentations of particular interest was covered by LWN.net under the title “Keeping Python competitive”, apparently discussing efforts to “make Python faster”, the challenges faced by different Python implementations, and the limitations imposed by the “canonical” CPython implementation that can frustrate performance improvement efforts.

Here is where this more recent coverage intersects with things I have noticed over the past few months. Every now and again, an attempt is made to speed Python up, sometimes building on the CPython code base and bolting on additional technology to boost performance, sometimes reimplementing the virtual machine whilst introducing similar performance-enhancing technology. When such projects emerge, especially when a large company is behind them in some way, expectations of a much faster Python are considerable.

Thus, when the Pyston reimplementation of Python became more widely known, undertaken by people working at Dropbox (who also happen to employ Python’s creator Guido van Rossum), people were understandably excited. Three years after that initial announcement, however, and those ambitious employees now have to continue that work on their own initiative. One might be reminded of an earlier project, Unladen Swallow, which also sought to perform just-in-time compilation of Python code, undertaken by people working at Google (who also happened to employ Python’s creator Guido van Rossum at the time), which was then abandoned as those people were needed to go and work on other things. Meanwhile, another apparently-broadly-similar project, Pyjion, is being undertaken by people working at Microsoft, albeit as a “side project at work”.

As things stand, perhaps the most dependable alternative implementation of Python, at least if you want one with a just-in-time compiler that is actively developed and supported for “production use”, appears to be PyPy. And this is only because of sustained investment of both time and funding over the past decade and a half into developing the technology and tracking changes in the Python language. Heroically, the developers even try and support both Python 2 and Python 3.

Motivations for Change

Of course, Google, Dropbox and Microsoft presumably have good reasons to try and get their Python code running faster and more efficiently. Certainly, the first two companies will be running plenty of Python to support their services; reducing the hardware demands of delivering those services is definitely a motivation for investigating Python implementation improvements. I guess that there’s enough Python being run at Microsoft to make it worth their while, too. But then again, none of these organisations appear to be resourcing these efforts at anything close to what would be marshalled for their actual products, and I imagine that even similar infrastructure projects originating from such companies (things like Go, for example) have had many more people assigned to them on a permanent basis.

And now, noting the existence of projects like Grumpy – a Python to Go translator – one has to wonder whether there isn’t some kind of strategy change afoot: that it now might be considered easier for the likes of Google to migrate gradually to Go and steadily reduce their dependency on Python than it is to remedy identified deficiencies with Python. Of course, the significant problem remains of translating Python code to Go and still having it interface with code written in C against Python’s extension interfaces, while maintaining reliability and performance in the result.

Indeed, the matter of Python’s “C API”, used by extensions written in C for Python programs to use, is covered in the LWN.net article. As people have sought to improve the performance of their software, they have been driven to rewrite parts of it in C, interfacing these performance-critical parts with the rest of their programs. Although such optimisation techniques make sense and have been a constant presence in software engineering more generally for many decades, it has almost become the path of least resistance when encountering performance difficulties in Python, even amongst the maintainers of the CPython implementation.

And so, alternative implementations need to either extract C-coded functionality and offer it in another form (maybe even written in Python, can you imagine?!), or they need to have a way of interfacing with it, one that could produce difficulties and impair their own efforts to deliver a robust and better-performing solution. Thus, attempts to mitigate CPython’s shortcomings have actually thwarted the efforts of other implementations to mitigate the shortcomings of Python as a whole.

Is “Python” Worth It?

You may well be wondering, if I didn’t manage to lose you already, whether all of these ambitious and brave efforts are really worth it. Might there be something with Python that just makes it too awkward to target with a revised and supposedly better implementation? Again, the LWN.net article describes sentiments that simpler, Python-like languages might be worth considering, mentioning the Hack language in the context of PHP, although I might also suggest Crystal in the context of Ruby, even though the latter is possibly closer to various functional languages and maybe only bears syntactic similarities to Ruby (although I haven’t actually looked too closely).

One has to be careful with languages that look dynamic but are really rather strict in how types are assigned, propagated and checked. And, should one choose to accept static typing, even with type inference, it could be said that there are plenty of mature languages – OCaml, for instance – that are worth considering instead. As people have experimented with Python-like languages, others have been quick to criticise them for not being “Pythonic”, even if the code one writes is valid Python. But I accept that the challenge for such languages and their implementations is to offer a Python-like experience without frustrating the programmer too much about things which look valid but which are forbidden.

My tuning of a Python program to work with Shedskin needed to be informed about what Shedskin was likely to allow and to reject. As far as I am concerned, as long as this is not too restrictive, and as long as guidance is available, I don’t see a reason why such a Python-like language couldn’t be as valid as “proper” Python. Python itself has changed over the years, and the version I first used probably wouldn’t measure up to what today’s newcomers would accept as Python at all, but I don’t accept that the language I used back in 1995 was not Python: that would somehow be a denial of history and of my own experiences.

Could I actually use something closer to Python 1.4 (or even 1.3) now? Which parts of more recent versions would I miss? And which parts of such ancient Pythons might even be superfluous? In pursuing my interests in source code analysis, I decided to consider such questions in more detail, partly motivated by the need to keep the investigation simple, partly motivated by laziness (that something might be amenable to analysis but more effort than I considered worthwhile), and partly motivated by my own experiences developing Python-based solutions.

A Leaner Python

Usually, after a title like that, one might expect to read about how I made everything in Python statically typed, or that I decided to remove classes and exceptions from the language, or do something else that would seem fairly drastic and change the character of the language. But I rather like the way Python behaves in a fundamental sense, with its classes, inheritance, dynamic typing and indentation-based syntax.

Other languages inspired by Python have had a tendency to diverge noticeably from the general form of Python: Boo, Cobra, Delight, Genie and Nim introduce static typing and (arguably needlessly) change core syntactic constructs; Converge and Mython focus on meta-programming; MyPy is the basis of efforts to add type annotations and “optional static typing” to Python itself. Meanwhile, Serpentine is a project being developed by my brother, David, and is worth looking at if you want to write software for Android, have some familiarity with user interface frameworks like PyQt, and can accept the somewhat moderated type discipline imposed by the Android APIs and the Dalvik runtime environment.

In any case, having already made a few rounds trying to perform analysis on Python source code, I am more interested in keeping the foundations of Python intact and focusing on the less visible characteristics of programs: effectively reading between the lines of the source code by considering how it behaves during execution. Solutions like Shedskin take advantage of restrictions on programs to be able to make deductions about program behaviour. These deductions can be sufficient in helping us understand what a program might actually do when run, as well as helping the compiler make more robust or efficient programs.
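To make the idea concrete, here is a minimal sketch of my own (not Shedskin’s actual rule set) of the kind of deduction such restrictions permit: if programs stick to homogeneous containers and consistent call sites, types can be inferred without a single annotation.

def scale(values, factor):
    # If every caller passes a list of ints and an int, a compiler can
    # deduce that "result" is also a list of ints and specialise the code.
    result = []
    for v in values:
        result.append(v * factor)
    return result

print(scale([1, 2, 3], 10))          # deducible: a list of ints
# print(scale([1, "two", 3.0], 10))  # valid Python, but a restricted
#                                    # subset would reject the mixed list

The point is not the syntax, which remains ordinary Python, but that the restriction makes the behaviour of the program knowable in advance.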

And the right kind of restrictions might even help us avoid introducing more disruptive restrictions such as having to annotate all the types in a program in order to tell us similar things (which appears to be one of the main directions of Python in the current era, unfortunately). I would rather lose exotic functionality that I have never really been convinced by, than retain such functionality and then have to tell the compiler about things it would otherwise have a chance of figuring out for itself.

Rocking the Boat

Certainly, being confronted with any list of restrictions, despite the potential benefits, can seem like someone is taking all the toys away. And it can be difficult to deliver the benefits to make up for this loss of functionality, no matter how frivolous some of it is, especially if there are considerable expectations in terms of things like performance. Plenty of people writing alternative Python implementations can attest to that. But there are other reasons to consider a leaner, more minimal, Python-like language and accompanying implementation.

For me, one rather basic reason is merely to inform myself about program analysis, figure out how difficult it is, and hopefully produce a working solution. But beyond that is the need to be able to exercise some level of control over the tools I depend on. Python 2 will in time no longer be maintained by the Python core development community; a degree of agitation has existed for some time to replace it with Python 3 in Free Software operating system distributions. Yet I remain unconvinced about Python 3, particularly as it evolves towards a language that offers “optional” static typing that will inevitably become mandatory (despite assertions that it will always officially be optional) as everyone sprinkles their code with annotations and hopes for the magic fairies and pixies to come along and speed it up, that latter eventuality being somewhat less certain.

There are reasons to consider alternative histories for Python in the form of Python-like languages. People argue about whether Python 3’s Unicode support makes it as suitable for certain kinds of programs as Python 2 has been, with the Mercurial project being notable in its refusal to hurry along behind the Python 3 adoption bandwagon. Indeed, PyPy was devised as a platform for such investigations, being only somewhat impaired in some respects by its rather intensive interpreter generation process (but I imagine there are ways to mitigate this).

Making a language implementation that is adaptable is also important. I like the ability to be able to cross-compile programs, and my own work attempts to make this convenient. Meanwhile, cross-building CPython has been a struggle for many years, and I feel that it says rather a lot about Python core development priorities that even now, with the need to cross-build CPython if it is to be available on mobile platforms like Android, the lack of a coherent cross-building strategy has left those interested in doing this kind of thing maintaining their own extensive patch sets. (Serpentine gets around this problem, as well as the architectural limitations of dropping CPython on an Android-based device and trying to hook it up with the different Android application frameworks, by targeting the Dalvik runtime environment instead.)

No Need for Another Language?

I found it depressingly familiar when David announced his Android work on the Python mobile-sig mailing list and got the following response:

In case you weren't aware, you can just write Android apps and services
in Python, using Kivy.  No need to invent another language.

Fortunately, various other people were more open-minded about having a new toolchain to target Android. Personally, the kind of “just use …” rhetoric reminds me of the era when everyone writing Web applications in Python was exhorted to “just use Zope”, which was a complicated (but admittedly powerful and interesting) framework whose shortcomings were largely obscured and downplayed until enough people had experienced them and felt that progress had to be made by working around Zope altogether and developing other solutions instead. Such zero-sum games – that there is one favoured approach to be promoted, with all others to be terminated or hidden – perhaps inspired by an overly-parroted “only one way to do it” mantra in the Python scene, have been rather damaging to both the community and to the adoption of Python itself.

Not being Python, not supporting every aspect of Python, has traditionally been seen as a weakness when people have announced their own implementations of Python or of Python-like languages. People steer clear of impressive works like PyPy or Nuitka because they feel that these things might not deliver everything CPython does, exactly like CPython does. Which is pretty terrible if you consider the heroic effort that the developer of Nuitka puts in to make his software work as similarly to CPython as possible, even going as far as to support Python 2 and Python 3, just as the PyPy team do.

Solutions like MicroPython have got away with certain incompatibilities with the justification that the target environment is rather constrained. But I imagine that even that project’s custodians get asked whether it can run Django, or whatever the arbitrarily-set threshold for technological validity might be. Never mind whether you would really want to run Django on a microcontroller or even on a phone. And never mind whether large parts of the mountain of code propping up such supposedly essential solutions could actually do with an audit and, in some cases, benefit from being retired and rewritten.

I am not fond of change for change’s sake, but new opportunities often bring new priorities and challenges with them. What then if Python as people insist on it today, with all the extra features added over the years to satisfy various petitioners and trends, is actually the weakness itself? What if the Python-like languages can adapt to these changes, and by having to confront their incompatibilities with hastily-written code from the 1990s and code employing “because it’s there” programming techniques, they can adapt to the changing environment while delivering much of what people like about Python in the first place? What if Python itself cannot?

“Why don’t you go and use something else if you don’t like what Python is?” some might ask. Certainly, Free Software itself is far more important to me than any adherence to Python. But I can also choose to make that other language something that carries forward the things I like about Python, not something that looks and behaves completely differently. And in doing so, at least I might gain a deeper understanding of what matters to me in Python, even if others refuse the lessons and the opportunities such Python-like languages can provide.

vanitasvitae's blog » englisch: Smack v4.2 Introduces OMEMO Support!

This blogpost doubles as a GSoC update, as well as a version release blog post.

OMEMO Clownfish logo (conversations.im)

I have the honour to announce the latest release of Smack! Besides bug fixes and additional features like Explicit Message Encryption (XEP-0380) and Message Processing Hints (XEP-0334), version 4.2 brings support for OMEMO Multi-End Message and Object encryption (XEP-0384). OMEMO was developed by Andreas Straub for the Conversations messenger (also as a Google Summer of Code project) in 2015. Since then it has become quite popular and drawn a lot of attention to XMPP in the media. My hope is that my efforts to develop an easy-to-use Smack module will result in an even broader adoption.

OMEMO is a protocol for multi-end to multi-end encrypted communication, which utilizes the so-called Double Ratchet algorithm. Besides the basic requirements of encrypted communication (confidentiality, authenticity and integrity), it also provides deniability as well as forward and future secrecy. Smack’s implementation brings support for encrypted single and group chats, including identity management and session renegotiation.

Current implementations (as well as this one) are based upon the libsignal library developed by Open Whisper Systems for their popular Signal (formerly TextSecure) messenger. Smack’s OMEMO support is structured in two modules. There is smack-omemo (APL licensed), which contains the logic specified in the XEP as well as some basic cryptographic code. The other module, smack-omemo-signal (GPLv3 licensed), implements some abstract methods defined by smack-omemo and encapsulates all function calls to libsignal.

Currently smack-omemo-signal is the only module available that implements the double ratchet functionality, but there has been a lot of discussion on the XMPP Standards Foundation’s mailing list regarding the use of alternative (more permissively licensed) libraries for OMEMO (such as Olm, a double ratchet implementation from our friends over at the [matrix] project). So once there is a new specification that enables the use of other libraries, it should be pretty easy to write another module for smack-omemo, enabling OMEMO support for clients that are not GPLv3 compatible as well.

Smack’s OMEMO modules are my first bigger contribution to a free software project and started as part of my bachelor’s thesis. I’m quite happy with the outcome :)

Smack Logo

Also Smack has a new Logo!

That was a lot of talking about OMEMO. Now comes the second function of this blog post, my GSoC update.

My project of implementing Jingle File Transfer (XEP-0234) for Smack is going relatively well. I’m stuck at some points where there are ambiguities in the XEP or things I don’t know yet, but most of the time I find another construction site where I can continue my work. Currently I’m implementing the stanza providers and elements needed for file transfer. Along the way I steadily create JUnit tests to keep the code coverage at a high level. This already pays off when there are fiddly changes in the element structure.

It’s a real pleasure to learn all the tools I never used before like code coverage reports or mocking and I think Flow does a good job introducing me to them one by one.

That’s all for now. Happy hacking :)

June 05 2017

Henri Bergius: NoFlo: six years of JavaScript dataflow

Quite a bit of time has passed since my two years of NoFlo post, and it is time to take another look at the state of the NoFlo ecosystem. To start with the basics, NoFlo is a JavaScript implementation of Flow-Based Programming:

In computer programming, flow-based programming (FBP) is a programming paradigm that defines applications as networks of “black box” processes, which exchange data across predefined connections by message passing, where the connections are specified externally to the processes. These black box processes can be reconnected endlessly to form different applications without having to be changed internally. FBP is thus naturally component-oriented.

With NoFlo, software is built by creating graphs that contain reusable components and define the program logic by determining how these components talk to each other.
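To give a flavour of what such a graph looks like, here is the canonical hello-world written in the .fbp graph definition language (a minimal sketch; Display is simply an arbitrary node name bound to the core/Output component):

# Send an initial packet of data to the IN port of the Display node,
# which runs the core/Output component and prints whatever it receives.
'Hello, NoFlo' -> IN Display(core/Output)

Larger programs are built the same way: nodes wired together port to port, with the wiring rather than the components carrying the application logic.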

I started the NoFlo open source project six years ago in Mountain View, California. My aim was to improve the JavaScript programming experience by bringing the FBP paradigm to the ecosystem. At the time the focus was largely on web API servers and extract, transform, load (ETL) programs, but the scope has since expanded quite a bit:

NoFlo is not a web framework or a UI toolkit. It is a way to coordinate and reorganize data flow in any JavaScript application. As such, it can be used for whatever purpose JavaScript can be used for. We know of NoFlo being used for anything from building web servers and build tools, to coordinating events inside GUI applications, driving robots, or building Internet-connected art installations.

Flowhub

Four years ago I wrote how UI was the missing part of NoFlo. Later the same year we launched a Kickstarter campaign to fix this.

Our promise was to design a new way to build software & manage complexity - a visual development environment for all.

Kickstarter campaign video: https://www.kickstarter.com/projects/noflo/noflo-development-environment/widget/video.html

This was wildly successful, being at the time the 5th highest funded software crowdfunding campaign. The result — Flowhub — was released to the public in 2014. Big thanks to all of our backers!

Here is how Fast Company wrote about NoFlo:

If NoFlo succeeds, it could herald a new paradigm of web programming. Imagine a world where anyone can understand and build web applications, and developers can focus on programming efficient components to be put to work by this new class of application architects. In a way, this is the same promise as the “learn to code” movement, which wants to teach everyone to be a programmer. Just without the programming.

With Flowhub you can manage full NoFlo projects in your browser. This includes writing components in JavaScript or CoffeeScript, editing graphs and subgraphs, running and introspecting the software, and creating unit tests. You can keep your project in sync with the GitHub integration.

Live programming NoFlo in Flowhub

Celebrating six years of NoFlo

Earlier this year we incorporated Flowhub in Germany. Now, to celebrate six years of NoFlo we’re offering a perpetual 30% discount on Flowhub plans. To lock in the discount, subscribe to a Flowhub plan before June 12th 2017 using the code noflo6.

Ecosystem

While NoFlo itself has by no means taken over the world yet, the overall ecosystem it is part of is looking very healthy. Sure, JavaScript fatigue is real, but at the same time it has gone through a pretty dramatic expansion.

JavaScript

As I wrote around the time I started NoFlo, JavaScript has indeed become a universal runtime. It is used on web browsers, server-side, as well as for building mobile and desktop applications. And with NoFlo you can target all those platforms with a single programming model and toolchain.

NPM, the de-facto standard for sharing JavaScript libraries, has become the most popular software repository for open source modules. Apart from the hundreds of thousands of other packages, you can also get prebuilt NoFlo components from NPM to cover almost any use case.
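Getting started typically amounts to an install from NPM (a sketch; noflo-core is one of the prebuilt component libraries mentioned above):

npm install noflo noflo-core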

Dataflow

After a long period of semi-obscurity, our Kickstarter campaign greatly increased awareness of FBP and dataflow programming. Several open source projects expanded the reach of FBP to other platforms, like MicroFlo for microcontroller programming, or PhpFlo for data conversion pipelines in PHP. To support more of these with common tooling, we standardized the FBP protocol, which allows IDEs like Flowhub to manage flow-based programs across different runtimes.

Dataflow also saw uptake in the bigger industry. Facebook’s Flux architecture brought flow-based programming to reactive web applications. Google’s TensorFlow made dataflow the way to build machine learning applications. And Google’s Cloud Dataflow uses these techniques for stream processing.

Tooling for flow-based programming

One big area of focus for us has been improving the tooling around NoFlo, as well as the other FBP systems. The FBP protocol has been a big enabler for both building better tools, and for collaboration between different FBP and dataflow systems.

FBP protocol

Here are some of the tools currently available for NoFlo developers:

  • Flowhub — browser-based visual programming IDE for NoFlo and other flow-based systems
  • noflo-nodejs — command-line interface for running NoFlo programs on Node.js
  • noflo-browser-app — template for building browser applications in NoFlo
  • MsgFlo — for running NoFlo and other FBP runtimes as a distributed system
  • fbp-spec — data-driven tests for NoFlo and other FBP environments
  • flowtrace — tool for retroactive debugging of NoFlo programs. Supports visual replay with Flowhub

NoFlo 0.8

NoFlo 0.8, released in March this year, is probably our most important release so far. It introduced a new component API and greatly clarified the component and network lifecycle.

NoFlo 0.8 program lifecycle

With this release, it is easier than ever to build well-behaved NoFlo components and to deal with the mixture of asynchronous and synchronous data processing. It also brings NoFlo a lot closer to the classical FBP concepts.

As part of the release process, we also fully overhauled the NoFlo documentation and wrote a new data transformations tutorial project.

To find out more about NoFlo 0.8, watch my recent NoFlo talk from Berlin Node.js meetup:

Video: https://www.youtube.com/embed/x_nhh3yg-Cs?list=PLIuD0578pkZ4Ciu9DNkRMG9yvFrEdVby7

Road to 1.0

In addition to providing lots of new APIs and functionality, NoFlo 0.8 also acts as the transitional release as we head towards the big 1.0 release. In this version we marked many old APIs as deprecated.

NoFlo 1.0 will essentially be 0.8, but with all the deprecated APIs removed. If you haven’t done so already, now is a good time to upgrade your NoFlo projects to 0.8 and make sure everything runs without deprecation warnings.

We intend to release NoFlo 1.0 later this summer once more of our open source component libraries have been updated to utilize the new features.

Get 30% discount on Flowhub

June 04 2017

Evaggelos Balaskas - System Engineer: DNS Certification Authority Authorization

CAA

Reading RFC 6844 you will find the definition of “DNS Certification Authority Authorization (CAA) Resource Record”.

You can read everything here: RFC 6844

So, what is CAA anyhow?

Certificate Authority

In a nutshell, you are declaring which Certificate Authority is allowed to issue certificates for your domain.

It’s another way to verify that the certificate your site is announcing is in fact signed by the issuer that the certificate is showing.

So let’s see what my certificate is showing:

balaskas_letsencrypt.jpg

DNS

Now, let’s find out what my DNS is telling us:

# dig caa balaskas.gr 

;; ANSWER SECTION:
balaskas.gr.        5938    IN  CAA 1 issue "letsencrypt.org"
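For reference, the corresponding record in a zone file would look something like this (a sketch, with example.com standing in for your domain; the leading number is the flags field):

example.com.    IN  CAA 0 issue "letsencrypt.org"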

Testing

You can also use the Qualys ssl server test:

https://www.ssllabs.com/ssltest/

balaskas_qualys.jpg

Tag(s): dns, CAA, letsencrypt

FSFE Fellowship Vienna » English: FSFE information booth at Linuxwochen Wien and Veganmania MQ

Conversations at the information booth
Ongoing in-depth consultations at the information booth

We organised an FSFE information booth at Linuxwochen Wien from 4 to 6 May and at Veganmania at the MQ in Vienna from 24 to 27 May. Like every year it went very well, and especially at Veganmania we could reach many people not yet familiar with free software. Since a Wikipedia event was taking place in Vienna at the same time as Veganmania, we even encountered people from all over the world, for example an FSF activist from Boston in the US.

We had re-stocked our leaflets with new versions of some well-received handouts from the past, and we put the Munich group’s new leaflets on free software programs for specific tasks to good use.

Even though we didn’t have many different volunteers, we managed to keep our information desk open for visitors’ questions during the whole opening hours of both events. In some cases we were the last booth to close because engaged consultations were still going on.

At Linuxwochen Wien, a local volunteer additionally hosted a well-attended three-hour workshop on image editing with GIMP and another one on creating new maps in Trigger Rally.

The GIMP workshop in particular attracted many people, and there is a clear demand for follow-ups, not only on GIMP but on other free design programs as well.

It is noticeable that more and more people are aware of free software and use it deliberately. Whether this slight and slow shift is related to our outreach work is uncertain, but it is certainly a welcome observation.

From our point of view, the most important reason why free software is not the default but still an exotic exception is the fact that it almost never comes pre-installed with new hardware – at least not on laptops or desktop machines. Many people understand this instantly as soon as they are told about common business practices where big corporations offer better conditions to resellers if they sell their software exclusively on all products. This is almost never an advantage for the customers, but profits are usually more important than customer satisfaction, and most people are just unaware of the tight grip in which corporations keep them. Sticking with certain products is rarely about satisfaction. Most of the time it would just be too burdensome to try something else. And these obstacles are by design. Unfortunately it is hard to convey what people are missing out on if they are not even prepared to try something different. Most people are not very happy with the situation, but because all their friends and colleagues share the same frustrations, they have the impression that there is no better alternative.

Maybe it would be a promising approach to make testimonials from non-technical people satisfied with free software more available to the public …

Evaggelos Balaskas - System Engineer: postfix TLS & ipv6

Postfix

smtp vs smtpd

smtp.jpg

  • postfix/smtp
    • the SMTP client that delivers emails to the Internet (outgoing mail server).
  • postfix/smtpd
    • the SMTP daemon that receives emails from the Internet (incoming mail server).

TLS

Encryption in mail transport is what we call opportunistic. If both parties (the sender’s outgoing mail server and the recipient’s incoming mail server) agree to exchange encryption keys, then a secure connection may be used. Otherwise a plain connection will be established. Plain as in non-encrypted, aka cleartext over the wire.

SMTP - Outgoing Traffic

In the beginning there were only three options in postfix:

  • none
  • may
  • encrypt

The default option on CentOS 6.x is none:

# postconf -d | grep smtp_tls_security_level
smtp_tls_security_level =

Nowadays, postfix supports more options, like:

  • dane
  • verify
  • secure

Here is the basic setup, to enable TLS on your outgoing mail server:

smtp_tls_security_level = may
smtp_tls_loglevel = 1

From postfix v2.6 onwards, you can disable weak encryption by selecting the cipher suites and protocols you prefer to use:

smtp_tls_ciphers = export
smtp_tls_protocols = !SSLv2, !SSLv3

You can also define where the file that holds all the root certificates on your linux server is, and thus verify the certificate that an incoming mail server provides:

smtp_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt

I don’t recommend going higher with your setup, because (unfortunately) not everyone is using TLS on their incoming mail server!

SMTPD - Incoming Traffic

To enable TLS in your incoming mail server, you need to provide some encryption keys aka certificates!

I use letsencrypt on my server and the below notes are based on that.

Let’s Encrypt

A quick explanation on what exists on your letsencrypt folder:

# ls -1 /etc/letsencrypt/live/example.com/

privkey.pem    ===>  Your Private Key
cert.pem       ===>  Your Certificate
chain.pem      ===>  Your Intermediate
fullchain.pem  ===>  Your Certificate with Your Intermediate 

Postfix

Below you can find the most basic configuration setup you need for your incoming mail server.

smtpd_tls_ask_ccert = yes
smtpd_tls_security_level = may
smtpd_tls_loglevel = 1

Your mail server is asking for a certificate so that a trusted TLS connection can be established between outgoing and incoming mail server.
The servers must exchange certificates and of course, verify them!

Now, it’s time to present your own domain certificate to the world. Offering only your public certificate cert.pem isn’t enough. You have to offer both your certificate and the intermediate’s certificate, so that the sender’s mail server can verify you by checking the digital signatures on those certificates.

smtpd_tls_cert_file = /etc/letsencrypt/live/example.com/fullchain.pem
smtpd_tls_key_file = /etc/letsencrypt/live/example.com/privkey.pem

smtpd_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt
smtpd_tls_CApath = /etc/pki/tls/certs

CAfile & CApath help postfix verify the sender’s certificate by looking at the file on your linux distribution that holds all the root certificates.

And you can also disable weak ciphers and protocols:

smtpd_tls_ciphers = high
smtpd_tls_exclude_ciphers = aNULL, MD5, EXPORT
smtpd_tls_protocols = !SSLv2, !SSLv3
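If you want to check the result from the outside, one quick way (a sketch, assuming your incoming mail server answers as mail.example.com) is openssl’s s_client with STARTTLS, which prints the certificate chain and the negotiated protocol and cipher:

# openssl s_client -starttls smtp -connect mail.example.com:25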

Logs

Here is an example from gmail:

SMTPD - Incoming Mail from Gmail

You can see that there is a trusted TLS connection established from Google:

Jun  4 11:52:07 kvm postfix/smtpd[14150]:
        connect from mail-oi0-x236.google.com[2607:f8b0:4003:c06::236]
Jun  4 11:52:08 kvm postfix/smtpd[14150]:
        Trusted TLS connection established from mail-oi0-x236.google.com[2607:f8b0:4003:c06::236]:
        TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)
Jun  4 11:52:09 kvm postfix/smtpd[14150]:
        4516420F32: client=mail-oi0-x236.google.com[2607:f8b0:4003:c06::236]
Jun  4 11:52:10 kvm postfix/smtpd[14150]:
        disconnect from mail-oi0-x236.google.com[2607:f8b0:4003:c06::236]

SMTP - Outgoing Mail from Gmail

And this is the response to Gmail:

Jun  4 12:01:32 kvm postfix/smtpd[14808]:
        initializing the server-side TLS engine
Jun  4 12:01:32 kvm postfix/smtpd[14808]:
        connect from example.com[2a00:1838:20:1::XXXX:XXXX]
Jun  4 12:01:33 kvm postfix/smtpd[14808]:
        setting up TLS connection from example.com[2a00:1838:20:1::XXXX:XXXX]
Jun  4 12:01:33 kvm postfix/smtpd[14808]:
        example.com[2a00:1838:20:1::XXXX:XXXX]: TLS cipher list "aNULL:-aNULL:ALL:!EXPORT:!LOW:!MEDIUM:+RC4:@strength:!aNULL:!MD5:!EXPORT:!aNULL"
Jun  4 12:01:33 kvm postfix/smtpd[14808]:
        Anonymous TLS connection established from example.com[2a00:1838:20:1::XXXX:XXXX]:
        TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)
Jun  4 12:01:35 kvm postfix/smtpd[14808]:
        disconnect from example.com[2a00:1838:20:1::XXXX:XXXX]

As you can see, in both cases (sending/receiving) the mail servers have established a trusted, secure TLSv1.2 connection.
The preferred cipher (in both scenarios) is: ECDHE-RSA-AES128-GCM-SHA256

IPv6

Tell postfix to prefer IPv6 over IPv4, and to use TLS if both mail servers support it!

#IPv6
smtp_address_preference = ipv6
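Note that smtp_address_preference only matters if IPv6 is enabled in the first place; depending on your Postfix version the default may be IPv4 only, so you may also need something like the following (a sketch):

inet_protocols = all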
Tag(s): postfix, tls, ipv6

Free Software – Frank Karlitschek_: Is doing nothing evil?

Last weekend I attended the openSUSE conference in Nürnberg. This is a really nice conference. Awesome location, great people, and an overall very relaxed atmosphere. I gave a talk about Nextcloud Security and what we plan to do in the future to make hosting a Nextcloud instance even easier and more secure.

I attended the Saturday keynote, which triggered a reaction on my side that I wanted to share. This is only my personal opinion and I’m sure a lot of people think differently. But I can’t help myself.

The talk was about management of infrastructure and automation. It was a really good talk from a technical perspective. Very informative and detailed. But in the middle of the talk, the presenter mentioned that he was involved in building autonomous military submarines. This is of course controversial. I personally wouldn’t get involved in actual weapon development, building things whose sole purpose is to kill people. But I understand that people have different opinions here and I can live with such a disagreement.

However, a bit later the presenter mentioned that he also worked for the US intelligence community to build surveillance systems to spy on people on a mass scale. Global, mass-scale surveillance, which obviously involves all the people in the room. He pointed this out as some kind of joke, noting he might have helped spy on the people in the room.

I’m sorry but I don’t think this is funny at all. The global surveillance systems are undemocratic, in a lot of cases illegal and an attack on the basic human rights of people.

I understand that playing and working with cool technology is fun. And there is a lot of opportunity to do this for secret services and for the military to earn money. But I think we as software developers have a responsibility here. We are building the technology of the future. So we as developers are partly in control of how the world will look in a few years.

We can’t just say: I close my eyes because this is only a job. Or I don’t want to know how this technology is used. I didn’t ask and no one told me so I’m innocent and not involved. Let me quote a well known saying here: “All that is necessary for the triumph of evil is that good men do nothing.”

I really have a hard time accepting that some people think that building mass surveillance systems is somehow funny or cool. And it is even more troubling to say this to the faces of the people you helped put under surveillance and to think that this is fun.

Sorry for the rant. But technology matters. Developers matter. Software matters and can be used in good ways and bad ways. We as developers and as a free software community have a responsibility and should not close our eyes.

June 02 2017

free software - Bits of Freedom: Automating the software toolchain

June 01 2017

Matthias Kirschner's Web log - fsfe: Free Software at Church Days

"Omnis enim res, quae dando non deficit, dum habetur et non datur, nondum habetur, quomodo habenda est."

("For if a thing is not diminished by being shared with others, it is not rightly owned if it is only owned and not shared.")

Saint Augustinus, 397 AD

Conference place for the Churchday in Magdeburg

This Latin quote was written on the FSFE's first t-shirt (see the last picture in this post about Reinhard Müller's work for the FSFE), and was quoted by the moderator Ralf Peter Reimann at a panel discussion during the "Church Days" in Magdeburg. Ralf Peter Reimann is responsible for internet and digital topics at the Protestant Church in the German Rhineland and organised this panel. He guided Harald Geywitz (Telefonica Germany) and myself through a discussion about "Free data, free sharing, being free". The recording of the discussion is now online (in German):

Video: https://www.youtube.com/embed/HtaS4km1vUI

Afterwards we had an interesting discussion with the audience, and I would like to thank everybody who was involved there. Self-critically, I have to say that we probably went too fast into the details, so it was difficult for some people in the audience to follow. Definitely something to improve in future discussions.

Furthermore the video is now also listed amongst others on the FSFE's video wiki. Feel free to browse it, watch and share the videos, give feedback about them, as well as add videos with the FSFE's participation which are not mentioned there.

May 30 2017

Matthias Kirschner's Web log - fsfe: Recording for "Limux the loss of a lighthouse"

On 26 May I had the honour to give the keynote at the openSUSE conference. They asked me to talk about the Limux project in Munich. This talk was a special one for me, as in 1999 SUSE 6.0 was my first GNU/Linux distribution and therefore also my start in the Free Software movement. Below you will find the abstract and the recording of the talk.

Entrance to the venue of the openSUSE conference 2017

Started in 200X, the Limux project was often cited as the lighthouse project for Free Software in public administration. Since then we have regularly heard rumours about it. Have they now switched back to proprietary software again or not? Didn't they already migrate back last year? Is it a trend that public administrations aren't using Free Software anymore? Have we failed, and is it time to get depressed and stop what we are doing? Do we need new strategies? Those are questions people in our community are confronted with.

We will shed some light on those questions, raise some more, and figure out what we -- as the Free Software community -- can learn from it.

You can either watch or download the video on the CCC's media server or on youtube:

Video: https://www.youtube.com/embed/jMdYxmjq0Vk

As I mention in the talk, I would be very interested in discussions what we can learn from the Limux project and how we – as the Free Software movement – can improve. If you have comments, please feel free to start the discussion on the FSFE's general English speaking discussion list.

May 28 2017

Blog – Think. Innovation.: No Time To Waste!

How are we going to feed 9+ billion people by 2050? Last week this question was at the heart of the Thought For Food Summit in Amsterdam, and I had the honour and pleasure of giving a Clinic on Opening Business Models and of being invited to co-judge the special Open Business Model Prize in the TFF Challenge Finals.

In a thrilling finals award ceremony team WasteBuster won the special Open Business Model Prize. Michael Kock, head of IP of main sponsor Syngenta, awarded the prize mainly for the potential the WasteBuster concept has for opening up their business model and open licensing their technology.

As the team put forward in their Finals pitch: they want to share the relatively simple and straightforward process of creating high-quality food powder from fruit and vegetable waste under a Creative Commons license, but left out exactly how they want to do this and what they hope to accomplish with it.

In this post I will be sharing my ideas on what WasteBuster could do with open licensing and opening up their business model in order to get a bigger impact faster. Because with mountains of perfectly good fruit and vegetables being thrown away every day, there is no time to waste!

Note: this is an initial thought experiment; I got only a glimpse of what team WasteBuster is up to by talking to them a few minutes and hearing their 5-minute pitch.

The problem that WasteBuster is solving, is this:

“Roughly one third of the food produced in the world for human consumption every year — approximately 1.3 billion tonnes — gets lost or wasted (fao.org).”

Of this food 40% to 50% is root crops, fruits and vegetables, which is 585 million tons a year, equivalent to 1.6 million tons per day, or 18.5 tons every second (that is 18,500 kilos of wasted fruit, vegetables and roots every second).

One of the main reasons this food is wasted is that there is not enough time for it to reach the people who need it: it goes bad in a couple of days.

What team WasteBuster did is find a way to increase the shelf life of this food from 2 days to 2 years by steaming, drying and pulverizing it using a relatively easy process and machinery. The food powder is then put into portion size bags and labeled and can be sold.

Given this great and proven technology, how do we make sure that it is available to anyone who can benefit from it in the shortest possible time, in a way that is beneficial to all stakeholders involved?

I believe a traditional centralized closed operation where one company is in control of designing, developing, manufacturing and distributing/selling the product is not going to lead to the optimal outcome. This process will take too long, marketing will be an enormous expense and manufacturing, finding resellers and shipping will all make the product overly expensive.

The open alternative could be to make all documentation, descriptions, tutorials, building plans and designs available under a Creative Commons Attribution Share-Alike license (CC-BY-SA). And the product should be designed in such a way that it can be built relatively easily and reliably using locally sourced materials.

The process of prototyping, testing, improving the design could also be done in an open way, initiating and building a community of people solving the same problem. The driving force of this community will be the WasteBuster team, tirelessly improving the process and machinery and communicating with, involving and interacting with the community in every step.

Local food-waste problem-solvers are then empowered to build their own mini-factory and create a viable system of stakeholders around their practice: farmers delivering the food, local companies supplying machine parts and consumables (the bags, labels etc.) and local resellers buying the food powder. These problem-solvers can be supported by the community and are encouraged to share their ideas, findings and improvements online.

Of course WasteBuster should also be able to earn revenue to keep investing in improving the process and machinery, disseminating the knowledge and supporting the community (as far as it does not support itself). They can do this in the following ways. If a specialized piece of equipment or part is necessary, they can create and sell that directly online, through partner channels and/or local resellers. If special training is necessary to build, maintain, repair and/or operate the machinery, they can provide online and/or local courses, or train-the-trainer programs with certifications, possibly under a franchise model.

The training materials do not need to be under a CC license, but they could be (I am not sure yet about this one). WasteBuster can charge for licensing and certifications using their brand, which should then be protected under a trademark. The brand then stands for guaranteed quality as it comes from the highly-trusted initiators and WasteBuster can also then give access to their channel network for added value. This is partly how Arduino has become successful using a trademarking model with their Open Source Technology.

WasteBuster can also sell additional items that are needed by local enterprises under their own brand, even the complete machinery, assembled or as a kit, for problem-solvers who do not have the time to construct it, but do have the money to buy it. If WasteBuster is successful, it is likely that ‘clone’ products will come on the market, but this is fine: this is actually the ‘bigger faster impact’ at work and is how amongst others Arduino has become so successful in such a short time.

An open model like this, with the fundamental technology openly licensed for the benefit of everyone, the value-added materials not openly licensed, and the brand protected with a trademark, could scale up really fast, making the desired impact in just a few years, without having to deal with the traditional linear growth problems of production, marketing, sales and distribution, while still creating a profitable, sustainable, healthy company.

The challenge is to make an open system where all stakeholders can benefit: fruit and vegetable growers, waste collectors, waste processors, suppliers of equipment, distributors and buyers of the food powder and last but not least: the WasteBuster co-founders and company.

What do you think? Let’s open up the conversation and support the WasteBuster team in their idea process!

– Diderik

 

May 26 2017

vanitasvitae's blog » englisch: Last week of GSoC Community Bonding

This is my report for the last week of community bonding. Next Tuesday the coding phase officially begins \o/.

I spent my week like I did the one before: writing tests and increasing the code coverage of my Smack OMEMO module. Today the coverage finally reached the same level as the main codebase, meaning my PR wouldn’t decrease the percentage anymore. Apart from that I’ve read some tips on java.nio usage for file transfer and made myself more familiar with its non-blocking IO concepts.

Throughout the week I took the occasional chance to distract myself from work by participating in OMEMO-related discussions on the standards mailing list. I’m quite happy that there’s a lively discussion on the topic, which seems to have a solution in prospect that is acceptable for everyone. Specifying the OMEMO XEP in a way that enables implementations using different crypto libraries is definitely a huge step forward which might bring broader adoption without leaving those who pioneered and developed the standard standing in the rain (all my subjective opinion). I was really surprised to see developers of the Matrix project participating in the discussion. That reminded me of what the spirit of FLOSS really is :)

I plan to spend the last days before the coding phase sketching out my project’s structure and relaxing a little before the hard work begins. One of my goals is to plan ahead, and I really hope to fulfill this goal.

Happy Hacking :)

Vanitasvitae

May 22 2017

Paul Boddie's Free Software-related blog » English: VGA Signal Generation with the PIC32

It all started after I had designed – and received from fabrication – a circuit board for prototyping cartridges for the Acorn Electron microcomputer. Although some prototyping had already taken place with an existing cartridge, with pins intended for ROM components being routed to drive other things, this board effectively “breaks out” all connections available to a cartridge that has been inserted into the computer’s Plus 1 expansion unit.

Acorn Electron cartridge breakout board

The Acorn Electron cartridge breakout board being used to drive an external circuit

One thing led to another, and soon my brother, David, was interfacing a microcontroller to the Electron in order to act as a peripheral being driven directly by the system’s bus signals. His approach involved having a program that would run and continuously scan the signals for read and write conditions and then interpret the address signals, sending and receiving data on the bus when appropriate.

Having acquired some PIC32 devices out of curiosity, with the idea of potentially interfacing them with the Electron, I finally took the trouble of looking at the datasheet to see whether some of the hard work done by David’s program might be handled by the peripheral hardware in the PIC32. The presence of something called “Parallel Master Port” was particularly interesting.

Operating this function in the somewhat insensitively-named “slave” mode, the device would be able to act like a memory device, with the signalling required by read and write operations mostly being dealt with by the hardware. Software running on the PIC32 would be able to read and write data through this port and be able to get notifications about new data while getting on with doing other things.

So began my journey into PIC32 experimentation, but this article isn’t about any of that, mostly because I put that particular investigation to one side after a degree of experience gave me perhaps a bit too much confidence, and I ended up being distracted by something far more glamorous: generating a video signal using the PIC32!

The Precedents’ Hall of Fame

There are plenty of people who have written up their experiments generating VGA and other video signals with microcontrollers. Here are some interesting examples:

And there are presumably many more pages on the Web with details of people sending pixel data over a cable to a display of some sort, often trying to squeeze every last cycle out of their microcontroller’s instruction processing unit. But, given an awareness of how microcontrollers should be able to take the burden off the programs running on them, employing peripheral hardware to do the grunt work of switching pins on and off at certain frequencies, maybe it would be useful to find examples of projects where such advantages of microcontrollers had been brought to bear on the problem.

In fact, I was already aware of the Maximite “single chip computer” partly through having seen the cloned version of the original being sold by Olimex – something rather resented by the developer of the Maximite for reasons largely rooted in an unfortunate misunderstanding of Free Software licensing on his part – and I was aware that this computer could generate a VGA signal. Indeed, the method used to achieve this had apparently been written up in a textbook for the PIC32 platform, albeit generating a composite video signal using one of the on-chip SPI peripherals. The Colour Maximite uses three SPI channels to generate one red, one green, and one blue channel of colour information, thus supporting eight-colour graphical output.

But I had been made aware of the Parallel Master Port (PMP) and its “master” mode, used to drive LCD panels with eight bits of colour information per pixel (or, using devices with many more pins than those I had acquired, with sixteen bits of colour information per pixel). Surely it would then be possible to generate 256-colour graphical output, at the very least?

Information from people trying to use PMP for this purpose was thin on the ground. Indeed, reading again one article that mentioned an abandoned attempt to get PMP working in this way, using the peripheral to emit pixel data for display on a screen instead of a panel, I now see that it actually mentions an essential component of the solution that I finally arrived at. But the author had unfortunately moved away from that successful component in an attempt to get the data to the display at a rate regarded as satisfactory.

Direct Savings

It is one thing to have the means to output data to be sent over a cable to a display. It is another to actually send the data efficiently from the microcontroller. Having contemplated such issues in the past, it was not a surprise that the Maximite and other video-generating solutions use direct memory access (DMA) to get the hardware, as opposed to programs, to read through memory and to write its contents to a destination, which in most cases seemed to be the memory address holding output data to be emitted via a data pin using the SPI mechanism.

I had also envisaged using DMA and was still fixated on using PMP to emit the different data bits to the output circuit producing the analogue signals for the display. Indeed, Microchip promotes the PMP and DMA combination as a way of doing “low-cost controllerless graphics solutions” involving LCD panels, so I felt that there surely couldn’t be much difference between that and getting an image on my monitor via a few resistors on the breadboard.

And so, a tour of different PIC32 features began, trying to understand the DMA documentation, the PMP documentation, all the while trying to get a grasp of what the VGA signal actually looks like, the timing constraints of the various synchronisation pulses, and battle various aspects of the MIPS architecture and the PIC32 implementation of it, constantly refining my own perceptions and understanding and learning perhaps too often that there may have been things I didn’t know quite enough about before trying them out!

Using VGA to Build a Picture

Before we really start to look at a VGA signal, let us first look at how a picture is generated by the signal on a screen:

VGA Picture Structure

The structure of a display image or picture produced by a VGA signal

The most important detail at this point is the central area of the diagram, filled with horizontal lines representing the colour information that builds up a picture on the display, with the actual limits of the screen being represented here by the bold rectangle outline. But it is also important to recognise that even though there are a number of visible “display lines” within which the colour information appears, the entire “frame” sent to the display actually contains yet more lines, even though they will not be used to produce an image.

Above and below – really before and after – the visible display lines are the vertical back and front porches whose lines are blank because they do not appear on the screen or are used to provide a border at the top and bottom of the screen. Such extra lines contribute to the total frame period and to the total number of lines dividing up the total frame period.

Figuring out how many lines a display will have seems to involve messing around with something called the “generalised timing formula”, and if you have an X server like Xorg installed on your system, you may even have a tool called “gtf” that will attempt to calculate numbers of lines and pixels based on desired screen resolutions and frame rates. Alternatively, you can look up some common sets of figures on sites providing such information.

What a VGA Signal Looks Like

Some sources show diagrams attempting to describe the VGA signal, but many of these diagrams are open to interpretation (in some cases, very much so). They perhaps show the signal for horizontal (display) lines, then other signals for the entire image, but they either do not attempt to combine them, or they instead combine these details ambiguously.

For instance, should the horizontal sync (synchronisation) pulse be produced when the vertical sync pulse is active or during the “blanking” period when no pixel information is being transmitted? This could be deduced from some diagrams but only if you share their authors’ unstated assumptions and do not consider other assertions about the signal structure. Other diagrams do explicitly show the horizontal sync active during vertical sync pulses, but this contradicts statements elsewhere such as “during the vertical sync period the horizontal sync must also be held low”, for instance.

After a lot of experimentation, I found that the following signal structure was compatible with the monitor I use with my computer:

VGA Signal Structure

The basic structure of a VGA signal, or at least a signal that my monitor can recognise

There are three principal components to the signal:

  • Colour information for the pixel or line data forms the image on the display and it is transferred within display lines during what I call the visible display period in every frame
  • The horizontal sync pulse tells the display when each horizontal display line ends, or at least the frequency of the lines being sent
  • The vertical sync pulse tells the display when each frame (or picture) ends, or at least the refresh rate of the picture

The voltage levels appear to be as follows:

  • Colour information should be at 0.7V (although some people seem to think that 1V is acceptable as “the specified peak voltage for a VGA signal”)
  • Sync pulses are supposed to be at “TTL” levels, which apparently can be from 0V to 0.5V for the low state and from 2.7V to 5V for the high state

Meanwhile, the polarity of the sync pulses is also worth noting. In the above diagram, they have negative polarity, meaning that an active pulse is at the low logic level. Some people claim that “modern VGA monitors don’t care about sync polarity”, but since it isn’t clear to me what determines the polarity, and since most descriptions and demonstrations of VGA signal generation seem to use negative polarity, I chose to go with the flow. As far as I can tell, the gtf tool always outputs the same polarity details, whereas certain resources provide signal characteristics with differing polarities.

It is possible, and arguably advisable, to start out trying to generate sync pulses and just grounding the colour outputs until your monitor (or other VGA-capable display) can be persuaded that it is receiving a picture at a certain refresh rate and resolution. Such confirmation can be obtained on a modern display by seeing a blank picture without any “no signal” or “input not supported” messages and by being able to activate the on-screen menu built into the device, in which an option is likely to exist to show the picture details.

How the sync and colour signals are actually produced will be explained later on. This section was merely intended to provide some background and gather some fairly useful details into one place.

Counting Lines and Generating Vertical Sync Pulses

The horizontal and vertical sync pulses are each driven at their own frequency. However, given that there are a fixed number of lines in every frame, it becomes apparent that the frequency of vertical sync pulse occurrences is related to the frequency of horizontal sync pulses, the latter occurring once per line, of course.

With, say, 622 lines forming a frame, the vertical sync will occur once for every 622 horizontal sync pulses, or at a rate that is 1/622 of the horizontal sync frequency or “line rate”. So, if we can find a way of generating the line rate, we can not only generate horizontal sync pulses, but we can also count cycles at this frequency, and every 622 cycles we can produce a vertical sync pulse.

But how do we calculate the line rate in the first place? First, we decide what our refresh rate should be. The “classic” rate for VGA output is 60Hz. Then, we decide how many lines there are in the display including those extra non-visible lines. We multiply the refresh rate by the number of lines to get the line rate:

60Hz * 622 = 37320Hz = 37.320kHz

On a microcontroller, the obvious way to obtain periodic events is to use a timer. Given a particular frequency at which the timer is updated, a quick calculation can be performed to discover how many times a timer needs to be incremented before we need to generate an event. So, let us say that we have a clock frequency of 24MHz, and a line rate of 37.320kHz, we calculate the number of timer increments required to produce the latter from the former:

24MHz / 37.320kHz = 24000000Hz / 37320Hz = 643

So, if we set up a timer that counts up to 642 and then upon incrementing again to 643 actually starts again at zero, with the timer sending a signal when this “wraparound” occurs, we can have a mechanism providing a suitable frequency and then make things happen at that frequency. And this includes counting cycles at this particular frequency, meaning that we can increment our own counter by 1 to keep track of display lines. Every 622 display lines, we can initiate a vertical sync pulse.
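To make the arithmetic concrete, here is a small standalone C sketch (not taken from the article; the constant names are purely illustrative) that reproduces the calculations above for the figures being used:

#include <stdio.h>

int main(void)
{
    const unsigned int refresh_hz = 60;                 /* frames per second */
    const unsigned int total_lines = 622;               /* visible lines plus the extra ones */
    const unsigned int peripheral_clock_hz = 24000000;  /* 24MHz peripheral clock */

    unsigned int line_rate_hz = refresh_hz * total_lines;           /* 37320 Hz */
    unsigned int timer_period = peripheral_clock_hz / line_rate_hz; /* roughly 643 increments per line */

    printf("line rate:    %u Hz\n", line_rate_hz);
    printf("timer period: %u increments per display line\n", timer_period);
    return 0;
}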

One aspect of vertical sync pulses that has not yet been mentioned is their duration. Various sources suggest that they should last for only two display lines, although the “gtf” tool specifies three lines instead. Our line-counting logic therefore needs to know that it should enable the vertical sync pulse by bringing it low at a particular starting line and then disable it by bringing it high again after two whole lines.

Generating Horizontal Sync Pulses

Horizontal sync pulses take place within each display line, have a specific duration, and they must start at the same time relative to the start of each line. Some video output demonstrations seem to use lots of precisely-timed instructions to achieve such things, but we want to use the peripherals of the microcontroller as much as possible to avoid wasting CPU time. Having considered various tricks involving specially formulated data that might be transferred from memory to act as a pulse, I was looking for examples of DMA usage when I found a mention of something called the Output Compare unit on the PIC32.

What the Output Compare (OC) units do is to take a timer as input and produce an output signal dependent on the current value of the timer relative to certain parameters. In clearer terms, you can indicate a timer value at which the OC unit will cause the output to go high, and you can indicate another timer value at which the OC unit will cause the output to go low. It doesn’t take much imagination to realise that this sounds almost perfect for generating the horizontal sync pulse:

  1. We take the timer previously set up which counts up to 643 and thus divides the display line period into units of 1/643.
  2. We identify where the pulse should be brought low and present that as the parameter for taking the output low.
  3. We identify where the pulse should be brought high and present that as the parameter for taking the output high.

Upon combining the timer and the OC unit, then configuring the output pin appropriately, we end up with a low pulse occurring at the line rate, but at a suitable offset from the start of each line.

VGA Display Line Structure

The structure of each visible display line in the VGA signal

In fact, the OC unit also proves useful in actually generating the vertical sync pulses, too. Although we have a timer that can tell us when it has wrapped around, we really need a mechanism to act upon this signal promptly, at least if we are to generate a clean signal. Unfortunately, handling an interrupt will introduce a delay between the timer wrapping around and the CPU being able to do something about it, and it is not inconceivable that this delay may vary depending on what the CPU has been doing.

So, what seems to be a reasonable solution to this problem is to count the lines and upon seeing that the vertical sync pulse should be initiated at the start of the next line, we can enable another OC unit configured to act as soon as the timer value is zero. Thus, upon wraparound, the OC unit will spring into action and bring the vertical sync output low immediately. Similarly, upon realising that the next line will see the sync pulse brought high again, we can reconfigure the OC unit to do so as soon as the timer value again wraps around to zero.
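To make the bookkeeping concrete, here is a rough C sketch of that line-counting logic. It is my own illustration rather than code from the project, the line at which the pulse starts is a placeholder value, and the Output Compare reconfiguration is left as comments:

#define TOTAL_LINES  622
#define VSYNC_START  600   /* placeholder: the line at which the vertical sync pulse begins */
#define VSYNC_LINES  2     /* duration of the vertical sync pulse in whole lines */

static unsigned int current_line = 0;

/* Called once per display line, when the line timer wraps around. */
void on_line_wraparound(void)
{
    current_line = (current_line + 1) % TOTAL_LINES;
    unsigned int next_line = (current_line + 1) % TOTAL_LINES;

    if (next_line == VSYNC_START) {
        /* arm the second Output Compare unit so that it pulls the vertical
           sync output low as soon as the timer next wraps around to zero */
    } else if (next_line == VSYNC_START + VSYNC_LINES) {
        /* reconfigure the unit to bring the output high again at the next
           wraparound, ending the pulse after VSYNC_LINES whole lines */
    }
}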

Inserting the Colour Information

At this point, we can test the basic structure of the signal and see if our monitor likes it. But none of this is very interesting without being able to generate a picture, and so we need a way of getting pixel information from the microcontroller’s memory to its outputs. We previously concluded that Direct Memory Access (DMA) was the way to go in reading the pixel data from what is usually known as a framebuffer, sending it to another place for output.

As previously noted, I thought that the Parallel Master Port (PMP) might be the right peripheral to use. It provides an output register, confusingly called the PMDIN (parallel master data in) register, that lives at a particular address and whose value is exposed on output pins. On the PIC32MX270, only the least significant eight bits of this register are employed in emitting data to the outside world, and so a DMA destination having a one-byte size, located at the address of PMDIN, is chosen.

The source data is the framebuffer, of course. For various retrocomputing reasons hinted at above, I had decided to generate a picture 160 pixels in width, 256 lines in height, and with each byte providing eight bits of colour depth (specifying how many distinct colours are encoded for each pixel). This requires 40 kilobytes and can therefore reside in the 64 kilobytes of RAM provided by the PIC32MX270. It was at this point that I learned a few things about the DMA mechanisms of the PIC32 that didn’t seem completely clear from the documentation.

Now, the documentation talks of “transactions”, “cells” and “blocks”, but I don’t think it describes them as clearly as it could do. Each “transaction” is just a transfer of a four-byte word. Each “cell transfer” is a collection of transactions that the DMA mechanism performs in a kind of batch, proceeding with these as quickly as it can until it either finishes the batch or is told to stop the transfer. Each “block transfer” is a collection of cell transfers. But what really matters is that if you want to transfer a certain amount of data and not have to keep telling the DMA mechanism to keep going, you need to choose a cell size that defines this amount. (When describing this, it is hard not to use the term “block” rather than “cell”, and I do wonder why they assigned these terms in this way because it seems counter-intuitive.)

You can perhaps use the following template to phrase your intentions:

I want to transfer <cell size> bytes at a time from a total of <block size> bytes, reading data starting from <source address>, having <source size>, and writing data starting at <destination address>, having <destination size>.

The total number of bytes to be transferred – the block size – is calculated from the source and destination sizes, with the larger chosen to be the block size. If we choose a destination size less than the source size, the transfers will not go beyond the area of memory defined by the specified destination address and the destination size. What actually happens to the “destination pointer” is not immediately obvious from the documentation, but for our purposes, where we will use a destination size of one byte, the DMA mechanism will just keep writing source bytes to the same destination address over and over again. (One might imagine the pointer starting again at the initial start address, or perhaps stopping at the end address instead.)

So, for our purposes, we define a “cell” as 160 bytes, being the amount of data in a single display line, and we only transfer one cell in a block. Thus, the DMA source is 160 bytes long, and even though the destination size is only a single byte, the DMA mechanism will transfer each of the source bytes into the destination. There is a rather unhelpful diagram in the documentation that perhaps tries to communicate too much at once, leading one to believe that the cell size is a factor in how the destination gets populated by source data, but the purpose of the cell size seems only to be to define how much data is transferred at once when a transfer is requested.

DMA Transfer Mechanism

The transfer of framebuffer data to PORTB using DMA cell transfers (noting that this hints at the eventual approach which uses PORTB and not PMDIN)

In the matter of requesting a transfer, we have already described the mechanism that will allow us to make this happen: when the timer signals the start of a new line, we can use the wraparound event to initiate a DMA transfer. It would appear that the transfer will happen as fast as both the source and the destination will allow, at least as far as I can tell, and so it is probably unlikely that the data will be sent to the destination too quickly. Once the transfer of a line’s pixel data is complete, we can do some things to set up the transfer for the next line, like changing the source data address to point to the next 160 bytes representing the next display line.
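For readers who want a feel for what such a setup involves, here is a hedged sketch of the per-line channel configuration, written against the DMA register names from Microchip's documentation (DCH0SSA, DCH0DSA and friends, as declared by the vendor's xc.h header). The article's pure GNU toolchain setup defines its own register access instead, the trigger IRQ number below is a placeholder, and all details should be checked against the PIC32MX270 datasheet:

#include <xc.h>   /* Microchip header providing the register declarations */

#define LINE_BYTES 160
#define PHYS(addr) ((unsigned int)(addr) & 0x1FFFFFFF)  /* kseg virtual -> physical */
#define LINE_TIMER_IRQ 0   /* placeholder: the line timer's interrupt number from the datasheet */

extern unsigned char framebuffer[];

void set_up_line_dma(unsigned int line)
{
    /* The DMA controller itself must also be enabled globally (DMACONbits.ON). */
    DCH0SSA  = PHYS(&framebuffer[line * LINE_BYTES]);  /* source: one display line */
    DCH0DSA  = PHYS(&PORTB);                           /* destination: the PORTB register */
    DCH0SSIZ = LINE_BYTES;   /* source size: a whole line of pixels */
    DCH0DSIZ = 1;            /* destination size: one byte, rewritten for every pixel */
    DCH0CSIZ = LINE_BYTES;   /* cell size: move the whole line for each trigger */

    DCH0ECONbits.CHSIRQ = LINE_TIMER_IRQ;  /* start a cell transfer on the line timer event */
    DCH0ECONbits.SIRQEN = 1;
    DCH0CONbits.CHEN = 1;                  /* enable the channel */
}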

(We could actually set the block size to the length of the entire framebuffer – by setting the source size – and have the DMA mechanism automatically transfer each line in turn, updating its own address for the current line. However, I intend to support hardware scrolling, where the address of the first line of the screen can be adjusted so that the display starts part way through the framebuffer, reaches the end of the framebuffer part way down the screen, and then starts again at the beginning of the framebuffer in order to finish displaying the data at the bottom of the screen. The DMA mechanism doesn’t seem to support the necessary address wraparound required to manage this all by itself.)
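The address arithmetic needed for that kind of hardware scrolling is simple enough to sketch. This is illustrative code of my own, with made-up names, showing how the per-line source offset would wrap around the framebuffer:

#define LINES      256
#define LINE_BYTES 160

static unsigned char framebuffer[LINES * LINE_BYTES];
static unsigned int scroll_start = 0;   /* framebuffer line shown at the top of the screen */

/* Return the source offset for the nth line on the screen, wrapping around
   at the end of the framebuffer. */
unsigned int line_offset(unsigned int screen_line)
{
    return ((scroll_start + screen_line) % LINES) * LINE_BYTES;
}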

Output Complications

Having assumed that the PMP peripheral would be an appropriate choice, I soon discovered some problems with the generated output. Although the data that I had stored in the RAM seemed to be emitted as pixels in appropriate colours, there were gaps between the pixels on the screen. Yet the documentation seemed to vaguely indicate that the PMDIN register was more or less continuously updated. That meant that the actual output signals were being driven low between each pixel, causing black-level gaps and ruining the result.

I wondered if anything could be done about this issue. PMP is really intended as some kind of memory interface, and it isn’t unreasonable for it to only maintain valid data for certain periods of time, modifying control signals to define this valid data period. That PMP can be used to drive LCD panels is merely a result of those panels themselves upholding this kind of interface. For those of you familiar with microcontrollers, the solution to my problem was probably obvious several paragraphs ago, but it needed me to reconsider my assumptions and requirements before I realised what I should have been doing all along.

Unlike SPI, which concerns itself with the bit-by-bit serial output of data, PMP concerns itself with the multiple-bits-at-once parallel output of data, and all I wanted to do was to present multiple bits to a memory location and have them translated to a collection of separate signals. But, of course, this is exactly how normal I/O (input/output) pins are provided on microcontrollers! They all seem to provide “PORT” registers whose bits correspond to output pins, and if you write a value to those registers, all the pins can be changed simultaneously. (This feature is obscured by platforms like Arduino where functions are offered to manipulate only a single pin at once.)

And so, I changed the DMA destination to be the PORTB register, which on the PIC32MX270 is the only PORT register with enough bits corresponding to I/O pins to be useful enough for this application. Even then, PORTB does not have a complete mapping from bits to pins: some pins that are available in other devices have been dedicated to specific functions on the PIC32MX270F256B and cannot be used for I/O. So, it turns out that we can only employ at most seven bits of our pixel data in generating signal data:

PORTB pin availability on the PIC32MX270F256B:

  Bits 15-13: RPB15, RPB14, RPB13
  Bit 12: not available
  Bits 11-7: RPB11, RPB10, RPB9, RPB8, RPB7
  Bit 6: not available
  Bits 5-0: RPB5, RPB4, RPB3, RPB2, RPB1, RPB0

We could target the first byte of PORTB (bits 0 to 7) or the second byte (bits 8 to 15), but either way we will encounter an unmapped bit. So, instead of choosing a colour representation making use of eight bits, we have to make do with only seven.

Initially, not noticing that RPB6 was not available, I was using a “RRRGGGBB” or “332” representation. But persuaded by others in a similar predicament, I decided to choose a representation where each colour channel gets two bits, and then a separate intensity bit is used to adjust the final intensity of the basic colour result. This also means that greyscale output is possible because it is possible to balance the channels.

The 2-bit-per-channel plus intensity colours

The colours employing two bits per channel plus one intensity bit, perhaps not shown completely accurately due to circuit inadequacies and the usual white balance issues when taking photographs
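Packing a colour into the seven usable bits might look like the following C sketch. The article does not spell out which PORTB bits carry which channel, so the layout here (blue in bits 0-1, green in bits 2-3, red in bits 4-5, intensity in bit 7, skipping the unavailable bit 6) is purely an assumption for illustration:

/* Pack two-bit red, green and blue values (0-3) plus an intensity flag
   into a single framebuffer byte, using an assumed bit layout. */
unsigned char pack_pixel(unsigned char r, unsigned char g, unsigned char b,
                         unsigned char intensity)
{
    return (unsigned char)(((intensity & 1) << 7) |
                           ((r & 3) << 4) |
                           ((g & 3) << 2) |
                           (b & 3));
}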

It is worth noting at this point that since we have left the 8-bit limitations of the PMP peripheral far behind us now, we could choose to populate two bytes of PORTB at once, aiming for sixteen bits per pixel but actually getting fourteen bits per pixel once the unmapped bits have been taken into account. However, this would double our framebuffer memory requirements for the same resolution, and we don’t have that much memory. There may be devices with more than sixteen bits mapped in the 32-bit PORTB register (or in one of the other PORT registers), but they had better have more memory to be useful for greater colour depths.

Back in Black

One other matter presented itself as a problem. It is all very well generating a colour signal for the pixels in the framebuffer, but what happens at the end of each DMA transfer once a line of pixels has been transmitted? For the portions of the display not providing any colour information, the channel signals should be held at zero, yet it is likely that the last pixel on any given line is not at the lowest possible (black) level. And so the DMA transfer will have left a stray value in PORTB that could then confuse the monitor, producing streaks of colour in the border areas of the display, making the monitor unsure about the black level in the signal, and also potentially confusing some monitors about the validity of the picture, too.

As with the horizontal sync pulses, we need a prompt way of switching off colour information as soon as the pixel data has been transferred. We cannot really use an Output Compare unit because that only affects the value of a single output pin, and although we could wire up some kind of blanking in our external circuit, it is simpler to look for a quick solution within the capabilities of the microcontroller. Fortunately, such a quick solution exists: we can “chain” another DMA channel to the one providing the pixel data, thereby having this new channel perform a transfer as soon as the pixel data has been sent in its entirety. This new channel has one simple purpose: to transfer a single byte of black pixel data. By doing this, the monitor will see black in any borders and beyond the visible regions of the display.

Wiring Up

Of course, the microcontroller still has to be connected to the monitor somehow. First of all, we need a way of accessing the pins of a VGA socket or cable. One reasonable approach is to obtain something that acts as a socket and that breaks out the different signals from a cable, connecting the microcontroller to these broken-out signals.

Wanting to get something for this task quickly and relatively conveniently, I found a product at a local retailer that provides a “male” VGA connector and screw-adjustable terminals to break out the different pins. But since the VGA cable also has a male connector, I also needed to get a “gender changer” for VGA that acts as a “female” connector in both directions, thus accommodating the VGA cable and the male breakout board connector.

Wiring up to the broken-out VGA connector pins is mostly a matter of following diagrams and the pin numbering scheme, illustrated well enough in various resources (albeit with colour signal transposition errors in some resources). Pins 1, 2 and 3 need some special consideration for the red, green and blue signals, and we will look at them in a moment. However, pins 13 and 14 are the horizontal and vertical sync pins, respectively, and these can be connected directly to the PIC32 output pins in this case, since the 3.3V output from the microcontroller is supposedly compatible with the “TTL” levels. Pins 5 through 10 can be connected to ground.

We have seen mentions of colour signals with magnitudes of up to 0.7V, but no explicit mention of how they are formed has been presented in this article. Fortunately, everyone is willing to show how they converted their digital signals to an analogue output, with most of them electing to use a resistor network to combine each output pin within a channel to produce a hopefully suitable output voltage.

Here, with two bits per channel, I take the most significant bit for a channel and send it through a 470ohm resistor. Meanwhile, the least significant bit for the channel is sent through a 1000ohm resistor. Thus, the former contributes more to the magnitude of the signal than the latter. If we were only dealing with channel information, this would be as much as we need to do, but here we also employ an intensity bit whose job it is to boost the channels by a small amount, making sure not to allow the channels to pollute each other via this intensity sub-circuit. Here, I feed the intensity output through a 2200ohm resistor and then to each of the channel outputs via signal diodes.

VGA Output Circuit

The circuit showing connections relevant to VGA output (generic connections are not shown)
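To get a feel for the levels the resistor network just described produces, the following standalone C sketch computes the voltage across an assumed 75 ohm termination inside the monitor for each combination of the two channel bits, treating the pins as ideal 3.3V/0V outputs and ignoring the intensity branch with its diode:

#include <stdio.h>

int main(void)
{
    const double vdd = 3.3;     /* PIC32 output level in volts */
    const double rload = 75.0;  /* assumed VGA input termination */
    const double rmsb = 470.0, rlsb = 1000.0;

    for (int msb = 0; msb <= 1; msb++)
        for (int lsb = 0; lsb <= 1; lsb++) {
            /* Node equation: currents through both resistors and the load sum to zero. */
            double numerator = vdd * (msb / rmsb + lsb / rlsb);
            double denominator = 1.0 / rmsb + 1.0 / rlsb + 1.0 / rload;
            printf("msb=%d lsb=%d -> %.2f V\n", msb, lsb, numerator / denominator);
        }
    return 0;
}

With both bits high this gives roughly 0.63V, which is in the region of the 0.7V level mentioned earlier.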

The Final Picture

I could probably go on and cover other aspects of the solution, but the fundamental aspects are probably dealt with sufficiently above to help others reproduce this experiment themselves. Populating memory with usable image data, at least in this solution, involves copying data to RAM, and I did experience problems with accessing RAM that are probably related to CPU initialisation (as covered in my previous article) and to synchronising the memory contents with what the CPU has written via its cache.

As for the actual picture data, the RGB-plus-intensity representation is not likely to be the format of most images these days. So, to prepare data for output, some image processing is needed. A while ago, I made a program to perform palette optimisation and dithering on images for the Acorn Electron, and I felt that it was going to be easier to adapt the dithering code than it was to figure out the necessary techniques required for software like ImageMagick or the Python Imaging Library. The pixel data is then converted to assembly language data definition statements and incorporated into my PIC32 program.

VGA output from a PIC32 microcontroller

VGA output from a PIC32 microcontroller, featuring a picture showing some Oslo architecture, with the PIC32MX270 being powered (and programmed) by the Arduino Duemilanove, and with the breadboards holding the necessary resistors and diodes to supply the VGA breakout and, beyond that, the cable to the monitor

To demonstrate control over the visible region, I deliberately adjusted the display frequencies so that the monitor would consider the signal to be carrying an image 800 pixels by 600 pixels at a refresh rate of 60Hz. Since my framebuffer is only 256 lines high, I double the lines to produce 512 lines for the display. It would seem that choosing a line rate to try and produce 512 lines has the monitor trying to show something compatible with the traditional 640×480 resolution and thus lines are lost off the screen. I suppose I could settle for 480 lines or aim for 300 lines instead, but I actually don’t mind having a border around the picture.

The on-screen menu showing the monitor's interpretation of the signal

The on-screen menu showing the monitor's interpretation of the signal
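The line doubling mentioned above amounts to nothing more than reusing each framebuffer line for two successive output lines, along the lines of this illustrative helper (not taken from the project):

/* Map an output line (0-511) onto the 256-line framebuffer by showing
   each framebuffer line twice. */
unsigned int framebuffer_line(unsigned int output_line)
{
    return output_line / 2;
}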

It is worth noting that I haven’t really mentioned a “pixel clock” or “dot clock” so far. As far as the display receiving the VGA signal is concerned, there is no pixel clock in that signal. And as far as we are concerned, the pixel clock is only important when deciding how quickly we can get our data into the signal, not in actually generating the signal. We can generate new colour values as slowly (or as quickly) as we might like, and the result will be wider (or narrower) pixels, but it shouldn’t make the actual signal invalid in any way.

Of course, it is important to consider how quickly we can generate pixels. Previously, I mentioned a 24MHz clock being used within the PIC32, and it is this clock that is used to drive peripherals and this clock’s frequency that will limit the transfer speed. As noted elsewhere, a pixel clock frequency of 25MHz is used to support the traditional VGA resolution of 640×480 at 60Hz. With the possibilities of running the “peripheral clock” in the PIC32MX270 considerably faster than this, it becomes a matter of experimentation as to how many pixels can be supported horizontally.

Some Oslo street art being displayed by the PIC32

Some Oslo street art being displayed by the PIC32

For my own purposes, I have at least reached the initial goal of generating a stable and usable video signal. Further work is likely to involve attempting to write routines to modify the framebuffer, maybe support things like scrolling and sprites, and even consider interfacing with other devices.

Naturally, this project is available as Free Software from its own repository. Maybe it will inspire or encourage you to pursue something similar, knowing that you absolutely do not need to be any kind of “expert” to stubbornly persist and to eventually get results!

Free Software – Frank Karlitschek_: The development of Global Scale

The architecture of Nextcloud is a classic web application architecture. I picked this architecture 7.5 years ago because it is very well known and has proven to scale relatively easily. This usually works with off-the-shelf technologies like HTTP load balancers, clusters of Linux web servers and clustered databases.

But for many years users and customers have asked for ways to distribute a single instance over several datacenters. A lot of users and customers run organizations that are not in one office, sometimes not even in one country or on one continent. So how can the service run distributed over different hosting centers on different continents?

Until now there was no good answer for this requirement.

Over the years I talked with users who experimented with different approaches. Unfortunately, none of them worked.

If you ask storage people how to solve this challenge, they say: no problem, just use a distributed storage system like Gluster, Ceph, Hadoop or others.

If you ask database people how to do this, they say: no problem, just use one of the clustering and replication systems for Oracle, MySQL, MariaDB and others.

The challenge is how this all works together. If a user changes a file in Nextcloud, then the file has to be changed, potential encryption keys updated, log files written, database tables changed, email notifications sent, external workflow scripts executed, and a lot of other things have to happen. All these operations triggered by a single file change have to happen in an atomic way: completely or not at all. Using database replication and storage replication independently will lead to a broken state and data loss. An additional problem is that you need the full bandwidth and storage in all data centers, so there is a lot of overhead.

The Global Scale architecture that we designed is currently, as far as I know, the only solution which solves this challenge.

An additional benefit is that storage, database and overall operational costs decrease because simpler, more standard commodity components can be used.

Another benefit is that the locality of data can be controlled. So if it is a legal requirement that certain files never leave a certain jurisdiction then this can be guaranteed with the right GS Balancer and File Access Control settings.

So far I have only talked about the benefit that GS breaks a service down into several data centers. This is only half of the truth. Nextcloud Global Scale can be used in an even more radical way: you could stop using clustered Nextcloud instances altogether, kill all central storage boxes and databases, and move completely to commodity hardware, using only local storage, local databases and local caching. This changes the world completely and makes big storage arrays, SAN, NFS and object store boxes obsolete.

The Global Scale architecture idea and implementation were developed over the last year with feedback and ideas from many different people: storage experts, database vendors, Linux distribution people, container and orchestration experts for easy automatic deployment, and several big customers and users of Nextcloud.

The main inspiration came out of long discussions with the people at DeiC, which runs the National Research and Education Network of Denmark. DeiC had already been running successful experiments with a similar architecture for a while.

At Nextcloud we are committed to developing everything we do as open source, and this includes this feature. Also, if you want to contribute to this architecture, no Contributor License Agreement is needed and you don’t need to transfer any rights to the Nextcloud company.

More information about Nextcloud Global Scale can be found here: https://nextcloud.com/globalscale

May 20 2017

vanitasvitae's blog » englisch: GSoC: Second, third week of community bonding

Hi all!

This is my report for the second week, as well as the first half of the third week, of GSoC community bonding, which I spent again working on finalizing my OMEMO code.

I dug deeper into writing test cases, mainly integration tests, and I found quite a lot of small, previously undetected bugs this way (yay!). This strengthens my plan to work test-driven during my GSoC project. The OMEMO code I currently work on was started as part of my bachelor thesis about 4 to 5 months ago, and at that time I was more concerned about having working code in the end, so I wrote no tests at all. Writing test cases AFTER the code is already written is not only a tedious task, but it's also often very difficult (because the code is not structured properly). So I learned my lesson the hard way :D

During testing I also found another bug in an XMPP server software, which prevents Smack from creating accounts on the server on the fly. Unfortunately this bug will not get fixed for the version I use (installed from the Debian testing repository, which I thought was *reasonably* new), which keeps me from doing proper testing the way it's meant to be done. I don’t have the time to compile the server software myself. Instead, I work around this issue by creating the accounts manually with a small bash script every time I run the test suite.

I also had to deal with a really strange bug with file writing and reading. smack-omemo has a set of 4 integration tests, which all write data into a temporary directory. After each test, the directory is deleted to prevent tests from influencing each other. The issue was that only the first test could read/write to the test directory; all subsequent tests failed for some reason. It took me a long time to notice that two folders were created (one in the working directory, another one in the subdirectory of the integration test framework). I am still not really sure what happened: the first folder was logged in all debug output, while files were written (by the first test) to the second folder. I guess it was caused by the temp directory being specified using a relative path, which messed up the tests that were instantiated by the test framework using reflection. But I’m really not sure about this. Specifying the directory using an absolute path fixed the issue in the end.

Last but not least, Flow and I worked out some more details about my GSoC project (implementing (encrypted) Jingle file transfer for Smack). The module will most likely be based upon java.nio to be scalable in the future. Flow also emphasized that the API should be as easy to use as possible, but at the same time powerful and extensible, which is a nice challenge (and probably a common one within the XMPP community). My initial plan was to create an XEP for OMEMO-encrypted Jingle file transfer. We decided that it would be of more value to specify the XEP in a way which allows arbitrary encryption techniques instead of being OMEMO-exclusive.

Currently there is a little bit of tension in the community regarding the OMEMO specification. I really hope (and believe) there is a solution capable of making everybody happy, and I’m looking forward to participating in an open discussion :)

Happy hacking!

Daniel's FSFE blog: Benchmarking microSD cards and more

It’s worth revisiting this blog post, as I am going to do more benchmarks and add more results over time.

Motivation

If you have ever tried using a flash card (such as an SD card, microSD card or USB drive) for your root or home filesystem on a small computing device or smartphone, you have probably noticed that flash cards are in most cases a lot slower than integrated eMMC flash. Since most filesystems use 4k blocks, random read/write performance with 4k blocks is what matters most in such scenarios. And while flash cards don’t come close to internal flash in these disciplines, there are significant differences between the models.

Jeff Geerling [1,2] has already benchmarked the performance of various microSD cards on different models of the “blobby” Raspberry Pi. I had a number of different microSD cards at hand and I tried to replicate his results on my sample. In addition, I extended the comparison to regular SD cards and USB drives.

Disclaimer


All data and information provided in this article is for informational purposes only. The author makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information in this article and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.

In no event will the author be liable for any loss or damage including, without limitation, indirect or consequential loss or damage, or any loss or damage whatsoever arising from loss of data or profits arising out of, or in connection with, the use of this article.

Environment and tools

I encourage every reader to replicate my benchmarks and share their results. Therefore, I first want to describe how the measurements were taken.

General equipment

For the benchmarks, I used the following hardware on my workstation (listing only relevant parts):

  • Logilink CR0015 USB 2.0 card reader [4]
  • Logilink CR0034A USB 3.0 card reader [5]
  • Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)
  • Debian 9 “Stretch”

iozone Benchmarks

For the first tests, just like Jeff, I used the open source (but non-free) tool “iozone” [3] in the current stable version available on Debian Stretch (3.429).

I disabled caches and benchmarked on raw devices to avoid measuring filesystem overhead. Therefore, I used the following call to iozone to run the benchmarks (in my case the SD card showed up as /dev/sde):

$ iozone -e -I -a -s 100M -r 4k -r 16M -i 0 -i 1 -i 2 -f /dev/sdX

(replacing sdX by the actual device name)

ddrescue benchmarks

I used GNU ddrescue (1.21) in synchronous mode to overwrite a device with zeroes and measured the performance there. For this purpose, I used the following call to do a write benchmark:

$ ddrescue -D --force /dev/zero /dev/sdX

(replacing sdX by the actual device name)

Similarly, to measure the read performance I used (I had more than 100GB RAM left for tmpfs):

$ ddrescue -D --force /dev/sdX /tmp/foo

(replacing sdX by the actual device name)

Results

iozone

The following table provides the results I obtained with the USB 2.0 card reader (rounded to 10 kB/s):

(16M = sequential with 16M records, 4K = random with 4K records; all figures in MB/s)

Manufacturer   Make / model   Type   Rating   Capacity   16M seq. read   16M seq. write   4K rand. read   4K rand. write
Adata          ?              mSD    C4       32 GB      9.58            2.97             2.62            0.64
Samsung        Evo            mSD    UHS-1    32 GB      17.80           10.13            4.45            1.32
Samsung        Evo+           mSD    UHS-1    32 GB      18.10           14.27            5.28            2.86
Samsung        Evo+ (2017)    mSD    UHS-3    64 GB      16.98           8.03             7.99            2.99
Sandisk        Extreme        mSD    UHS-3    64 GB      18.44           11.22            5.20            2.36
Sandisk        Ultra          mSD    C10      64 GB      18.16           8.69             3.73            0.80
Toshiba        Exceria        mSD    UHS-3    64 GB      16.10           6.60             3.82            0.09
Kingston       ?              mSD    C4       8 GB       14.69           3.71             3.97            0.18

The next table provides the results I obtained with the USB 3.0 card reader (also rounded to 10 kB/s):

(all figures in MB/s)

Manufacturer   Make / model   Type   Rating   Capacity   16M seq. read   16M seq. write   4K rand. read   4K rand. write
Samsung        Evo+ (2017)    mSD    UHS-3    64 GB      42.47           40.10            8.48            2.29
Samsung        Evo+           mSD    UHS-1    32 GB      42.53           34.48            8.01            3.97
Sandisk        Ultra          mSD    C10      64 GB      18.21           19.19            5.35            1.06

And here are the USB drives:

(all figures in MB/s; DNF = did not finish)

Manufacturer   Make / model        Type      Remarks   Capacity   16M seq. read   16M seq. write   4K rand. read   4K rand. write
Sandisk        Cruzer              USB 2.0   -         4 GB       22.89           4.05             6.86            0.44
Kingston       Data Traveller G3   USB 2.0   -         8 GB       21.52           4.96             6.96            0.08
Super Talent   Pico C              USB 2.0   gold      8 GB       36.72           13.77            DNF             DNF

ddrescue

Using the USB 3.0 card reader, I obtained the following results:

Manufacturer   Make / model   Type   Rating   Capacity   Avg read (MB/s)   Avg write (MB/s)
Sandisk        Evo+           mSD    C4       32 GB      41.95             21.54

Various USB drives:

Manufacturer   Make / model   Type      Remarks   Capacity   Avg read (MB/s)   Avg write (MB/s)
Sandisk        Cruzer         USB 2.0   -         4 GB       36.37             2.64
Super Talent   Pico C         USB 2.0   gold      8 GB       36.29             19.80

Verdict and future work

I can confirm Jeff’s results about microSD cards and would also recommend the Evo+ (both the old and the 2017 model), which have the best 4K random write performance in the sample. On the other hand, I am very disappointed by the Toshiba Exceria card. In fact, running a device from this card with very sluggish performance was what prompted this benchmarking initiative in the first place. And indeed, after switching to the Evo+, the device feels much snappier now.

Of course, it would be worthwhile to add more cards and drives to this benchmark. Also, using fio instead of the non-free iozone might be interesting. Furthermore, running the benchmarks directly on the target device or using other USB 3.0 card readers might be interesting as well.

References

[1] http://www.pidramble.com/wiki/benchmarks/microsd-cards
[2] http://www.jeffgeerling.com/blogs/jeff-geerling/raspberry-pi-microsd-card
[3] http://www.iozone.org/
[4] http://www.logilink.eu/Products_LogiLink/Notebook-Computer_Accessories/Card_Reader/Cardreader_USB_20_Stick_external_for_SD-MMC_CR0015.htm
[5] http://www.logilink.org/Produkte_LogiLink/Notebook-Computerzubehoer/Kartenleser/Cardreader_USB_30_SD-SD-HC-Micro_SD-Micro_SD-HC_CR0034A.htm?seticlanguage=en

May 19 2017

Paul Boddie's Free Software-related blog » English: Evaluating PIC32 for Hardware Experiments

Some time ago I became aware of the PIC32 microcontroller platform, perhaps while following various hardware projects, pursuing hardware-related interests, and looking for pertinent documentation. Although I had heard of PIC microcontrollers before, my impression of them was that they were mostly an alternative to other low-end computing products like the Atmel AVR series, but with a development ecosystem arguably more reliant on its vendor than the Atmel products for which tools like avr-gcc and avrdude exist as Free Software and have gone on to see extensive use, perhaps most famously in connection with the Arduino ecosystem.

What made PIC32 stand out when I first encountered it, however, was that it uses the MIPS32 instruction set architecture instead of its own specialised architecture. Consequently, instead of being reliant on the silicon vendor and random third-party tool providers for proprietary tools, the possibility of using widely-used Free Software tools exists. Moreover, with a degree of familiarity with MIPS assembly language, thanks to the Ben NanoNote, I felt that there might be an opportunity to apply some of my elementary skills to another MIPS-based system and gain some useful experience.

Some basic investigation occurred before I made any attempt to acquire hardware. As anyone having to pay attention to the details of hardware can surely attest, it isn’t much fun to obtain something only to find that some necessary tool required for the successful use of that hardware happens to be proprietary, only works on proprietary platforms, or is generally a nuisance in some way that makes you regret the purchase. Looking around at various resources such as the Pinguino Web site gave me some confidence that there were people out there using PIC32 devices with Free Software. (Of course, the eventual development scenario proved to be different from that envisaged in these initial investigations, but previous experience has taught me to expect that things may not go as originally foreseen.)

Some Discoveries

So that was perhaps more than enough of an introduction, given that I really want to focus on some of my discoveries in using my acquired PIC32 devices, hoping that writing them up will help others to use Free Software with this platform. So, below, I will present a few discoveries that may well, for all I know, be “obvious” to people embedded in the PIC universe since it began, or that might be “superfluous” to those happy that Microchip’s development tools can obscure the operation of the hardware to offer a simpler “experience”.

I should mention at this point that I chose to acquire PDIP-profile products for use with a solderless breadboard. This is the level of sophistication at which the Arduino products mostly operate and it allows convenient prototyping involving discrete components and other electronic devices. The evidence from the chipKIT and Pinguino sites suggested that it would be possible to set up a device on a breadboard, wire it up to a power supply with a few supporting components, and then be able to program it. (It turned out that programming involved another approach than indicated by that latter reference, however.)

The 28-pin product I elected to buy was the PIC32MX270F256B-50/SP. I also bought some capacitors that were recommended for wiring up the device. In the picture below, you can just about see the capacitor connecting two of the pins, and there is also a resistor pulling up one of the pins. I recommend obtaining a selection of resistors of different values so as to be able to wire up various circuits as you encounter them. Note that the picture does not show a definitive wiring guide: please refer to the product documentation or to appropriate project documentation for such things.

PIC32 device on a mini-breadboard

PIC32 device on a mini-breadboard connected to an Arduino Duemilanove for power and programming

Programming the Device

Despite the apparent suitability of a program called pic32prog, recommended by the “cheap DIY programmer” guide, I initially found success elsewhere. I suppose it didn’t help that the circuit diagram was rather hard to follow for me, as someone who isn’t really familiar with certain electrical constructs that have been mixed in, arguably without enough explanation.

Initial Recommendation: ArduPIC32

Instead, I looked for a solution that used an Arduino product (not something procured from ephemeral Chinese “auction site” vendors) and found ArduPIC32 living a quiet retirement in the Google Code Archive. Bypassing tricky voltage level conversion and connecting an Arduino Duemilanove with the PIC32 device on its 5V-tolerant JTAG-capable pins, ArduPIC32 mostly seemed to work, although I did alter it slightly to work the way I wanted to and to alleviate the usual oddness with Arduino serial-over-USB connections.

However, I didn’t continue to use ArduPIC. One reason is that programming using the JTAG interface is slow, but a much more significant reason is that the use of JTAG means that the pins on the PIC32 associated with JTAG cannot be used for other things. This is either something that “everybody” knows or just something that Microchip doesn’t feel is important enough to mention in convenient places in the product documentation. More on this topic below!

Final Recommendation: Pickle (and Nanu Nanu)

So, I did try and use pic32prog and the suggested circuit, but had no success. Then, searching around, I happened to find some very useful resources indeed: Pickle is a GPL-licensed tool for uploading data to different PIC devices including PIC32 devices; Nanu Nanu is a GPL-licensed program that runs on AVR devices and programs PIC32 devices using the ICSP interface. Compiling the latter for the Arduino and uploading it in the usual way (actually done by the Makefile), it can then run on the Arduino and be controlled by the Pickle tool.

Admittedly, I did have some problems with the programming circuit, most likely self-inflicted, but the developer of these tools was very responsive and, as I know myself from being in his position in other situations, provided the necessary encouragement that was perhaps most sorely lacking to get the PIC32 device talking. I used the “sink” or “open drain” circuit so that the Arduino would be driving the PIC32 device using a suitable voltage and not the Arduino’s native 5V. Conveniently, this is the default configuration for Nanu Nanu.

PIC32 on breadboard with Arduino programming circuit

PIC32 on breadboard with Arduino programming circuit (and some LEDs for diagnostic purposes)

I should point out that the Pinguino initiative promotes USB programming similar to that employed by the Arduino series of boards. Even though that should make programming very easy, it is still necessary to program the USB bootloader on the PIC32 device using another method in the first place. And for my experiments involving an integrated circuit on a breadboard, setting up a USB-based programming circuit is a distraction that would have complicated the familiarisation process and would have mostly duplicated the functionality that the Arduino can already provide, even though this two-stage programming circuit may seem a little contrived.

Compiling Code for the Device

This is perhaps the easiest problem to solve, strongly motivating my choice of PIC32 in the first place…

Recommendation: GNU Toolchain

PIC32 uses the MIPS32 instruction set. Since MIPS has been around for a very long time, and since the architecture was prominent in workstations, servers and even games consoles in the late 1980s and 1990s, remaining in widespread use in more constrained products such as routers as this century has progressed, the GNU toolchain (GCC, binutils) has had a long time to comfortably support MIPS. Although the computer you are using is not particularly likely to be MIPS-based, cross-compiling versions of these tools can be built to run on, say, x86 or x86-64 while generating MIPS32 executable programs.

And fortunately, Debian GNU/Linux provides the mipsel-linux-gnu variant of the toolchain packages (at least in the unstable version of Debian) that makes the task of building software simply a matter of changing the definitions for the compiler, linker and other tools in one’s Makefile to use these variants instead of the unprefixed “host” gcc, ld, and so on. You can therefore keep using the high-quality Free Software tools you already know. The binutils-mipsel-linux-gnu package provides an assembler, if you just want to practise your MIPS assembly language, as well as a linker and tools for creating binaries. Meanwhile, the gcc-mipsel-linux-gnu package provides a C compiler.

Perhaps the only drawback of using the GNU tools is that people using the proprietary tools supplied by Microchip and partners will post example code that uses special notation interpreted in a certain way by those products. Fortunately, with some awareness that this is going on, we can still support the necessary functionality in our own way, as described below.

Configuring the Device

With the PIC32, and presumably with the other PIC products, there is a distinct activity of configuring the device when programming it with program code and data. This isn’t so obvious until one reads the datasheets, tries to find a way of changing some behaviour of the device, and then stumbles across the DEVCFG registers. These registers cannot be set in a running program: instead, they are “programmed” before the code is run.

You might wonder what the distinction is between “programming” the device to take the code you have written and “programming” the configuration registers, and there isn’t much difference conceptually. All that matters is that you have your program written into one part of the device’s memory and you also ask for certain data values to be written to the configuration registers. How this is done in the Microchip universe is with something like this:

#pragma config SOMETHING = SOMEVALUE;

With a basic GNU tool configuration, we need to find the equivalent operation and express it in a different way (at least as far as I know, unless someone has implemented an option to support this kind of notation in the GNU tools). The mechanism for achieving this is related to the linker script and is described in a section of this article presented below. For now, we will concentrate on what configuration settings we need to change.

Recommendation: Disable JTAG

As briefly mentioned above, one thing that “everybody” knows, at least if they are using Microchip’s own tools and are busy copying code from Microchip’s examples, is that the JTAG functionality takes over various pins and won’t let you use them, even if you switch on a “peripheral” in the microcontroller that needs to use those pins. Maybe I am naive about how intrusive JTAG should or should not be, but the lesson from this matter is just to configure the device to not have JTAG features enabled and to use the ICSP programming method recommended above instead.

Recommendation: Disable Watchdog Timer

Again, although I am aware of the general concept of a watchdog timer – something that resets a device if it thinks that the device has hung, crashed, or experienced something particularly bad – I did not expect something like this to necessarily be configured by default. In case it is, and one does see lots of code assuming so, then it should be disabled as well. Otherwise, I can imagine that you might experience spontaneous resets for no obvious reason.

Recommendation: Configure the Oscillator and Clocks

If you need to change the oscillator frequency or the origin of the oscillator used by the PIC32, it is perhaps best to do this in the configuration registers rather than try and mess around with this while the device is running. Indeed, some configuration is probably unavoidable even if there is a need to, say, switch between oscillators at run-time. Meanwhile, the relationship between the system clock (used by the processor to execute instructions) and the peripheral clock (used to interact with other devices and to run things like timers) is defined in the configuration registers.

Linker Scripts

So, to undertake the matter of configuration, a way is needed to express the setting of configuration register values in a general way. For this, we need to take a closer look at linker scripts. If you routinely compile and link software, you will be using linker scripts without realising it, because such scripts are telling the tools things about where parts of the program should be stored, what kinds of addresses they should be using, and so on. Here, we need to take more of an interest in such matters.

Recommendation: Expand a Simple Script

Writing linker scripts does not seem like much fun. The syntax is awkward to read and to produce as a human being, and knowledge about tool output is assumed. However, it is possible to start with a simple script that works for someone else in a similar situation and to modify it conservatively in order to achieve what you need. I started out with one that just defined the memory regions and a few sections. To avoid reproducing all the details here, I will just show what the memory regions for a configuration register look like:

  config2                    : ORIGIN = 0xBFC00BF4, LENGTH = 0x4
  physical_config2           : ORIGIN = 0x3FC00BF4, LENGTH = 0x4

These lines will be written in the MEMORY construct in the script. What they tell us is that config2 is a region of four bytes starting at virtual address 0xBFC00BF4, which is the location of the DEVCFG2 register as specified in the PIC32MX270 documentation. However, this register actually has a physical address of 0x3FC00BF4. The difference between virtual addresses and physical addresses is perhaps easiest to summarise by saying that CPU instructions would use the virtual address when referencing the register, whereas the actual memory of the device employs the physical address to refer to this four-byte region.

Meanwhile, in the SECTIONS construct, there needs to be something like this:

  .devcfg2 : {
        *(.devcfg2)
        } > config2 AT > physical_config2

Now you might understand my remark about the syntax! Nevertheless, what these lines do is tell the tools to put things from a section called .devcfg2 in the physical_config2 memory region and, if there were to be any address references in the data (which there aren't in this case), then they would use the addresses in the config2 region.

Recommendation: Define Configuration Sections and Values in the Code

Since I have been using assembly language, here is what I do in my actual program source file. Having looked at the documentation and figured out which configuration register I need to change, I introduce a section in the code that defines the register value. For DEVCFG2, it looks like this:

.section .devcfg2, "a"
.word 0xfff9fffb        /* DEVCFG2<18:16> = FPLLODIV<2:0> = 001;
                        DEVCFG2<6:4> = FPLLMUL<2:0> = 111;
                        DEVCFG2<2:0> = FPLLIDIV<2:0> = 011 */

Here, I fully acknowledge that this might not be the best approach, but when you’re learning about all these things at once, you make progress as best you can. In any case, what this does is to tell the assembler to include the .devcfg2 section and to populate it with the specified “word”, which is four bytes on the 32-bit PIC32 platform. This word contains the value of the register, expressed in hexadecimal with the most significant digit first.

Returning to our checklist of configuration tasks, what we now need to do is to formulate each one in terms of configuration register values, introduce the necessary sections and values, and adjust the values to contain the appropriate bit patterns. The above example shows how the DEVCFG2 bits are adjusted and set in the value of the register. Here is a short amplification of the operation:

DEVCFG2 bits:   31…28   27…24   23…20   19…16   15…12   11…8    7…4     3…0
Binary value:   1111    1111    1111    1001    1111    1111    1111    1011
Hex value:      f       f       f       9       f       f       f       b

(FPLLODIV<2:0> occupies bits 18…16, FPLLMUL<2:0> bits 6…4 and FPLLIDIV<2:0> bits 2…0 within the nibbles shown above.)

Here, the bits occupied by the three FPLL fields are those of interest and have been changed to the desired values. It turns out that we can set all the other bits to 1 for the functionality we want (or don’t want) in this case.

By the way, the JTAG functionality is disabled in the DEVCFG0 register (JTAGEN, bit 2, on this device). The watchdog timer is disabled in DEVCFG1 (FWDTEN, bit 23, on this device).

Recommendation: Define Regions for Exceptions and Interrupts

The MIPS architecture has the processor jump to certain memory locations when bad things happen (exceptions) or when other things happen (interrupts). We will cover the details of this below, but while we are setting up the places in memory where things will reside, we might as well define where the code to handle exceptions and interrupts will be living:

  .flash : { *(.flash*) } > kseg0_program_mem AT > physical_program_mem

This will be written in the SECTIONS construct. It relies on a memory region being defined, which would appear in the MEMORY construct as follows:

  kseg0_program_mem    (rx)  : ORIGIN = 0x9D000000, LENGTH = 0x40000
  physical_program_mem (rx)  : ORIGIN = 0x1D000000, LENGTH = 0x40000

These definitions allow the .flash section to be placed at 0x9D000000 but actually be written to memory at 0x1D000000.

Initialising the Device

On the various systems I have used in the past, even when working in assembly language I never had to deal with the earliest stages of the CPU’s activity. However, on the MIPS systems I have used in more recent times, I have been confronted with the matter of installing code to handle system initialisation, and this does require some knowledge of what MIPS processors would like to do when something goes wrong or if some interrupt arrives and needs to be processed.

The convention with the PIC32 seems to be that programs are installed within the MIPS KSEG0 region (one of the four principal regions) of memory, specifically at address 0x9FC00000, and so in the linker script we have MEMORY definitions like this:

  kseg0_boot_mem       (rx)  : ORIGIN = 0x9FC00000, LENGTH = 0xBF0
  physical_boot_mem    (rx)  : ORIGIN = 0x1FC00000, LENGTH = 0xBF0

As you can see, this region is far shorter than the 512MB of the KSEG0 region in its entirety. Indeed, 0xBF0 is only 3056 bytes! So, we need to put more substantial amounts of code elsewhere, some of which will be tasked with handling things when they go wrong.

Recommendation: Define Exception and Interrupt Handlers

As we have seen, the matter of defining routines to handle errors and interrupt conditions falls on the unlucky Free Software developer in this case. When something goes wrong, like the CPU not liking an instruction it has just seen, it will jump to a predefined location and try and execute code to recover. By default, with the PIC32, this location will be at address 0x80000000 which is the start of RAM, but if the RAM has not been configured then the CPU will not achieve very much trying to load instructions from that address.

Now, it can be tempting to set the “exception base”, as it is known, to be the place where our “boot” code is installed (at 0x9FC00000). So if something bad does happen, our code will start running again “from the top”, a bit like what used to happen if you ever wrote BASIC code that said…

10 ON ERROR GOTO 10

Clearly, this isn’t particularly desirable because it can mask problems. Indeed, I found that because I had not observed a step in the little dance required to change interrupt locations, my program would be happily restarting itself over and over again upon receiving interrupts. This is where the work done in the linker script starts to pay off: we can move the exception handler, this being the code residing at the exception base, to another region of memory and tell the CPU to jump to that address instead. We should therefore never have unscheduled restarts occurring once this is done.

Again, I have been working in assembly language, so I position my exception handling code using a directive like this:

.section .flash, "a"

exception_handler:
    /* Exception handling code goes here. */

Note that .flash is the section we mentioned in the linker script. After this directive, the exception handler is defined so that the CPU has something to jump to. But exceptions are just one kind of condition that may occur, and we also need to handle interrupts. Although we might be able to handle both things together, we can instead position an interrupt handler after the exception handler at a well-defined offset, and the CPU can be told to use that when it receives an interrupt. Here is what I do:

.org 0x200

interrupt_handler:
    /* Interrupt handling code goes here. */

The .org directive tells the assembler that the interrupt handler will reside at an offset of 0x200 from the start of the .flash section. This number isn’t something I have made up: it is defined by the MIPS architecture and will be observed by the CPU when suitably configured.

So that leaves the configuration. Although one sees a lot of advocacy for the “multi-vector” interrupt handling support of the PIC32, possibly because the Microchip example code seems to use it, I wanted to stick with the more widely available “single-vector” support which is what I have effectively described above: you have one place the CPU jumps to and it is then left to the code to identify the interrupt condition – what it is that happened, exactly – and then handle the condition appropriately. (Multi-vector handling has the CPU identify the kind of condition and then choose a condition-specific location to jump to all by itself.)

The following steps are required for this to work properly:

  1. Make sure that the CP0 (co-processor 0) STATUS register has the BEV bit set. Otherwise things will fail seemingly inexplicably.
  2. Set the CP0 EBASE (exception base) register to use the exception handler address. This changes the target of the jump that occurs when an exception or interrupt occurs.
  3. Set the CP0 INTCTL (interrupt control) register to use a non-zero vector spacing, even though this setting probably has no relevance to the single-vector mode.
  4. Make sure that the CP0 CAUSE (exception cause) register has the IV bit set. This tells the CPU that the interrupt handler is at that magic 0x200 offset from the exception handler, meaning that interrupts should be dispatched there instead of to the exception handler.
  5. Now make sure that the CP0 STATUS register has the BEV bit cleared, so that the CPU will now use the new handlers.

To enable exceptions and interrupts, the IE bit in the CP0 STATUS register must be set, but there are also other things that must be done for them to actually be delivered.
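
Expressed as instructions, the dance might look something like the following sketch. It assumes the exception_handler label from the .flash section shown earlier and uses the standard CP0 register numbers (STATUS is register 12, INTCTL is register 12 select 1, CAUSE is register 13, EBASE is register 15 select 1); hazard barriers and the exact ordering should be checked against the MIPS documentation rather than taken from here:

        /* 1. Set BEV (bit 22) in the CP0 STATUS register. */
        mfc0    $t0, $12
        lui     $t1, 0x0040         /* 0x00400000 = 1 << 22 */
        or      $t0, $t0, $t1
        mtc0    $t0, $12

        /* 2. Point EBASE at the exception handler. */
        lui     $t0, %hi(exception_handler)
        ori     $t0, $t0, %lo(exception_handler)
        mtc0    $t0, $15, 1

        /* 3. Use a non-zero vector spacing in INTCTL (VS field, bits 9..5). */
        li      $t0, 0x20
        mtc0    $t0, $12, 1

        /* 4. Set IV (bit 23) in CAUSE so interrupts use the 0x200 offset. */
        mfc0    $t0, $13
        lui     $t1, 0x0080         /* 0x00800000 = 1 << 23 */
        or      $t0, $t0, $t1
        mtc0    $t0, $13

        /* 5. Clear BEV again so the new handlers take effect. */
        mfc0    $t0, $12
        lui     $t1, 0x0040
        nor     $t1, $t1, $zero     /* invert the mask */
        and     $t0, $t0, $t1
        mtc0    $t0, $12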

Recommendation: Handle the Initial Error Condition

As I found out with the Ben NanoNote, a MIPS device will happily run in its initial state, but it turns out that this state is some kind of “error” state that prevents exceptions and interrupts from being delivered, even though interrupts may have been enabled. This does make some kind of sense: if a program is in the process of setting up interrupt handlers, it doesn’t really want an interrupt to occur before that work is done.

So, the MIPS architecture defines some flags in the CP0 STATUS register to override normal behaviour as some kind of “fail-safe” controls. The ERL (error level) bit is set when the CPU starts up, preventing errors (or probably most errors, at least) as well as interrupts from interrupting the execution of the installed code. The EXL (exception level) bit may also be set, preventing exceptions and interrupts from occurring even when ERL is clear. Thus, both of these bits must be cleared for interrupts to be enabled and delivered, and what happens next rather depends on how successful we were in setting up those handlers!
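
In code, and again only as a hedged sketch, clearing these flags and setting the IE bit mentioned above might look like this (in the STATUS register, IE is bit 0, EXL is bit 1 and ERL is bit 2):

        /* Clear ERL (bit 2) and EXL (bit 1) in CP0 STATUS, then set
           IE (bit 0) so that interrupts can actually be delivered. */
        mfc0    $t0, $12
        li      $t1, 0xfffffff9     /* mask with bits 1 and 2 cleared */
        and     $t0, $t0, $t1
        ori     $t0, $t0, 0x1       /* set IE */
        mtc0    $t0, $12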

Recommendation: Set up the Global Offset Table

Something that I first noticed when looking at code for the Ben NanoNote was the initialisation of a register to refer to something called the global offset table (or GOT). As anyone already familiar with MIPS assembly language, or with the way code is compiled for the architecture, will know, programs follow a convention where they refer to objects via a table containing the addresses of those objects. For example:

Global Offset Table

  Offset from Start     Member
  +0                    0
  +4                    address of _start
  +8                    address of my_routine
  +12                   address of other_routine
  …                     …

But in order to refer to the table, something has to have the address of the table. This is where the $gp register comes in: it holds that address and lets the code access members of the table relative to the address.

What seems to work for setting up the $gp register is this:

        lui $gp, %hi(_GLOBAL_OFFSET_TABLE_)
        ori $gp, $gp, %lo(_GLOBAL_OFFSET_TABLE_)

Here, the upper 16 bits of the $gp register are set with the “high” 16 bits of the value assigned to the special _GLOBAL_OFFSET_TABLE_ symbol, which provides the address of the special .got section defined in the linker script. This value is then combined using a logical “or” operation with the “low” 16 bits of the symbol’s value.

(I’m sure that anyone reading this far will already know that MIPS instructions are fixed length and are the same length as the address being loaded here, so it isn’t possible to fit the address into the load instruction. So each half of the address has to be loaded separately by different instructions. If you look at the assembled output of an assembly language program employing the “li” instruction, you will see that the assembler breaks this instruction down into “lui” and “ori” if it needs to. Note well that you cannot rely on the “la” instruction to load the special symbol into $gp precisely because it relies on $gp having already been set up.)

Rounding Off

I could probably go on about MIPS initialisation rituals, setting up a stack for function calls, and so on, but that isn’t really what this article is about. My intention here is to leave enough clues and reminders for other people in a similar position and perhaps even my future self.

Even though I have some familiarity with the MIPS architecture, I suppose you might be wondering why I am not evaluating more open hardware platforms. I admit that it is partly laziness: I could get into doing FPGA-related stuff and deploy open microcontroller cores, maybe even combining them with different peripheral circuits, but that would be a longer project with a lot of familiarisation involved, plus I would have to choose carefully to get a product supported by Free Software. It is also cheapness: I could have ordered a HiFive1 board and started experimenting with the RISC-V architecture, but that board seems expensive for what it offers, at least once you disregard the pioneering aspects of the product and focus on the applications of interest.

So, for now, I intend to move slowly forwards and gain experiences with an existing platform. In my next article on this topic, I hope to present some of the things I have managed to achieve with the PIC32 and a selection of different components and technologies.
