> the actual argument value the JVM receives is "/jars/*", and in turn decides to be helpful, and expand the wildcard anyway
Whenever I see such things, I immediately think "whatever the resulting order is, it had better not matter"; and if it does, which is definitely true for Java classpaths, I consider it a bug that needs to be fixed ASAP, before it causes what happened in the article.
Yeah, my jaw was dropping as I realized how far they went with this, from checking their mount options to reading ext4 source code. Directory order is almost always an implementation detail (I'm pretty sure it is on ext4), and even if it isn't, you still shouldn't rely on it (for when someone decides to migrate your production machines to BTRFS because they want snapshots, and now your app has some weird breakage). The problem is that the app depends on directory order, and you need to fix that, not figure out how to predict the directory order.
Maybe there should be a mount option to randomize directory order that people can use in their staging environments.
> Maybe there should be a mount option to randomize directory order that people can use in their staging environments.
The behavior I've witnessed suggests that the order is based on inode numbering, which is initially sequential from creation time, and drifts semi-randomly as inodes are unlinked and reused. I don't know this for a fact, but it makes enough sense. Directory ordering should be assumed to be random in all cases, as you suggest.
Also, each command-line argument string is limited to 128 KiB on Linux: https://unix.stackexchange.com/questions/120642/what-defines...
Counting on the order of files to support multiple versions of jars was never a good idea. Java has had multi-release jar files for this use case since Java 9. See <https://docs.oracle.com/en/java/javase/24/docs/api/java.base...>.
Since duplicates on the classpath don't cause problems, a quick & dirty fix is to manually list versioned jars first, in order, then the jars/* argument.
> there was a client library that needed a Bouncy Castle “provider” with a version “jdk15”+ as the client initialization used specific properties from a class, and those properties were only available in “jdk15”+.
> up until the node image update, we “fortunately” had node images with directory hash seeds ordering “jdk15” or “jdk18” before “jdk14”.
So the actual bug is that something needing jdk15+ should either retry or be deterministically fed a valid file, right? And this whole article is figuring out why the filesystem coincidentally masked it by accidentally always happening to hand it a file with what it needed?
> something needing jdk15+
Actually, no, that "15" refers to Java 1.5, aka Java 5, released 2004. Bouncy Castle has some funky variants, specifically for Java 1.1, 1.2, 1.3, 1.4, 5, 6, 7, 8. All you actually need is the Bouncy Castle for Java 8 onwards, which covers pretty much all versions of Java in use today.
The bug is that multiple providers of Bouncy Castle don't cleanly work when in the classpath together. The authors of Bouncy Castle aren't changing that, because they're like "use our software correctly, please". It's not Java's fault: you can make classes that don't work on old versions of the JDK, but you can't make new Java somehow notice you've included a jar written specifically for an old version of the JVM.
Java did introduce the ability to create multi-release jar files, where you can have JDK-version-specific classes/resources in one jar file... but only from Java 9 onwards. All this mixing and matching by filename that Bouncy Castle uses is for Java 1.1 - Java 1.8 only.
You can also mix and match and cause failure by using one of the Bouncy Castle JCE provider variants with the wrong corresponding "pkix", "util", "mail" jars (extra jars for all the things you might want to do with cryptography that _aren't_ part of the standardised Java Cryptography Extensions API that the main "provider" jar implements). And you can also mess up by mixing FIPS-approved BC with FIPS-not-approved BC.
You only need one set of jars:
* If you don't need FIPS approval: bcprov-jdk18on, bcutil-jdk18on, bcpkix-jdk18on, ...
* If you do: bc-fips, bcutil-fips, bcpkix-fips, ...
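To make the "only one set of jars" rule checkable, here is a minimal sketch in Java that flags a jar directory containing more than one `bcprov` variant. The class name `BcProviderCheck`, its methods, and the file-name pattern are assumptions made for illustration, not part of Bouncy Castle or any build tool, and the version numbers in the usage line are only examples.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical startup/CI check: collects the distinct "bcprov" variants
// (jdk14, jdk15on, jdk18on, ...) found in a list of jar file names.
final class BcProviderCheck {
    // Assumed naming scheme: matches e.g. "bcprov-jdk18on-1.78.jar"
    // and captures the variant token "jdk18on".
    private static final Pattern BCPROV =
            Pattern.compile("^bcprov-(jdk[0-9a-z]+)(-.*)?\\.jar$");

    static Set<String> variants(List<String> jarFileNames) {
        Set<String> found = new TreeSet<>();
        for (String name : jarFileNames) {
            Matcher m = BCPROV.matcher(name);
            if (m.matches()) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    // More than one variant means which classes win depends on classpath order.
    static boolean hasConflict(List<String> jarFileNames) {
        return variants(jarFileNames).size() > 1;
    }
}
```

For example, `BcProviderCheck.hasConflict(List.of("bcprov-jdk14-1.77.jar", "bcprov-jdk18on-1.78.jar"))` reports a conflict, while a directory with only the `jdk18on` jars does not.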
It does matter for performance.
If you read files in the same order they are on disk (often, the order in which they were written, which readdir on modern filesystems should choose to produce), I/O is much faster.
Order of files listed in a directory need not match the order of the bytes saved in the physical media.
It's worth noticing that the performance difference between sequential and non-sequential reads will differ significantly between types of devices. It's much more noticeable on a spinning hard disk drive than it is on a solid-state drive.
On spinning rust, sure. That does not hold for SSDs (which most consumer-grade computers have now).
You’d still miss out on some potential prefetch cache hits
Literally condemn any computer that still comes new from the factory with spinning rust. I was using SSDs back in 2012.
Build tools supporting duplicate class detection have existed for… well a long time. Ignore them at your own peril.
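For readers who have not seen what such a check does, here is a rough sketch in Java of duplicate class detection across a set of jars. It is not how any particular build plugin implements it; the class name `DuplicateClassFinder` and its method are made up for illustration, and it simply compares entry names without looking at class contents.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: reports every .class entry that appears in more
// than one of the given jars, together with the jars that contain it.
final class DuplicateClassFinder {
    static Map<String, List<Path>> findDuplicates(List<Path> jars) throws IOException {
        Map<String, List<Path>> owners = new TreeMap<>();
        for (Path jar : jars) {
            try (JarFile jarFile = new JarFile(jar.toFile())) {
                Enumeration<JarEntry> entries = jarFile.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    // Skip META-INF (signatures, multi-release variants, ...).
                    if (name.endsWith(".class") && !name.startsWith("META-INF/")) {
                        owners.computeIfAbsent(name, k -> new ArrayList<>()).add(jar);
                    }
                }
            }
        }
        // Keep only classes provided by two or more jars.
        owners.values().removeIf(list -> list.size() < 2);
        return owners;
    }
}
```

A non-empty result means that which copy of a class gets loaded depends on classpath order, which is exactly the situation described in the article.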
The orange site discusses the article in the first footnote here: https://news.ycombinator.com/item?id=43573507
Why "the orange site"?
It was referenced in the article as "the orange site"; however, the reason it was initially named that is probably HN's system for keeping popularity from being artificially driven high. The details of this are, as far as I know, pretty scarce, but the idea is that if you try to get to the top of Hacker News, they somehow detect that and penalize you. So people have taken to calling it "the orange site" in order to avoid this detection when talking about HN.
how about it being a simple gentle nod to the plain design of HN.
stop idolizing HN. look at the privacy policy and tell me if it still looks appealing to you
Not parent, but yes. 100% yes. It loads quickly, has great content density, lacks tons of JavaScript that tanks performance on slower machines, reminds me of the older, better times. For the same reason, many people still prefer the old Reddit UI compared to the new UI.
See the first sentence of the article!
?
> the title is a cheeky reference to something at the front page of the orange site today
Yes, that's what I'm asking. Why do people refer to HN as "the orange site"?
Oh, right. Well, that's easy: it's an orange site.
I've seen it typically (though not universally) used seemingly dismissively, so I've always assumed it was a euphemism. People very commonly refrain from naming a thing directly if they disapprove.
Disclaimer: I'm no mind reader.
Always fun when code relies on the order of iterating over a dir (which in general is clearly not defined to have any order; even iterating the same dir twice in a row might not yield the same order, depending on "stuff" such as the exact file system used).
So if order matters, always sort.
(Luckily in most situations where dir iter order matters, the performance impact from sorting is acceptable or even outright irrelevant.)
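Applied to the classpath case from the article, "always sort" could mean expanding the jar directory yourself and handing the JVM an explicit, sorted list instead of a wildcard. A minimal sketch in Java, assuming a flat directory of jars; the names `SortedClasspath` and `build` are invented for this example. Note that sorting only makes the order reproducible; it does not decide which of two conflicting jars is the right one.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: builds a classpath string whose order does not
// depend on the order in which the filesystem returns directory entries.
final class SortedClasspath {
    static String build(Path jarDir) {
        try (Stream<Path> entries = Files.list(jarDir)) {
            return entries
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted() // plain string order, independent of readdir order
                    .collect(Collectors.joining(File.pathSeparator));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints a value suitable for passing to "java -cp".
        System.out.println(build(Path.of(args[0])));
    }
}
```

The sort costs next to nothing for a few hundred jars, which matches the point above about the performance impact being irrelevant in most such cases.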
By the way, max hardlink count for ext4 seems configured ridiculously low for modern standards, at least on Ubuntu.
"ls" can take ages on a large folder. Is there a way to make it more immediate, i.e. streaming output without sorting?
It's something like ls -u from memory.
Looks like it's -U (capital U). But I just tried it and it still took several seconds for the first filename to appear. It was not the spinning up of the disk because I first did ls in the parent folder which was immediate. The second time I did ls on the large folder, though, it was fast (even without -U).
> It was not the spinning up of the disk because I first did ls in the parent folder which was immediate
That doesn't tell you anything; the parent's dentries could have been cached days ago and still present, meaning it didn't actually access the disk or cause it to be spun up (if it wasn't) at all.
When doing any kind of repeatable measurement or experimentation on disks you will want to drop the page cache every time first:
Well, I just booted the computer ;) But you are right, dropping the cache is probably a better way.
I have a folder with 5500 subfolders. Doing "ls -U" in that folder (after dropping the page cache like above) takes 50 seconds (!) And the dir entries appear all at once, i.e. not in a streaming way.
Its parent folder only contains 6 subfolders. Doing a cache drop followed by "ls -U" gives immediate results.
How to investigate this further? (Using an Ubuntu 18.04 system)
strace can tell you what system calls it's doing, what the results are, and how long they're taking, which may help narrow it down.
Thanks. Interestingly, strace speeds up the operation. What took 50s after a cache drop now becomes immediate with "strace -f ls -U".
Dropping the cache and doing "time ls -U" gives:
Update: never mind, it appears to be something in my shell. Switching to tcsh completely eradicated the problem.
It's probably that ls in the other shell is the builtin instead of the binary, when strace runs ls -U it does run the binary and not the shell builtin. tcsh must also delegate to the binary instead of a builtin, or their builtin is faster.
Yeah, I haven't figured it out yet, but when I do /bin/ls, then indeed the problem does not show. Probably a case of Bash trying to be smarter than it needs to be.
Yes, you're right, it's -U on Linux (1). On Mac it's -f (2). Linux also has -f, which is equivalent to -a -U.
(1) https://man7.org/linux/man-pages/man1/ls.1.html
(2) https://ss64.com/mac/ls.html
Could they have avoided that issue by specifying the classpath without the star?
So -cp /jars/ instead of -cp /jars/*?
So what was the production fix? Surely you're not hex-editing the image until the end of time?
The production fix is: don't include 3 versions of the same dependency in the image build (use "bcprov-jdk18on" and don't use any other "bcprov").
Another fix can be to use a fat jar (containing your software and all its dependencies), but this doesn't work for Bouncy Castle, because Cryptography Is Special(TM), and Java won't load cryptography providers unless their jars are signed, and including the cryptography provider jar in the fat jar means it loses its signature.
> The production fix is: don't include 3 versions of the same dependency in the image build (use "bcprov-jdk18on" and don't use any other "bcprov").
I doubt anyone is doing that manually, that’s probably done by mvn/gradle/sbt/whatever the cool Java kids use these days. Do the build tools not know about this problem and just make a mess?
It's Bouncy Castle's particular situation. The Java build tools are totally fine with resolving thousands of version dependencies so everyone is happy. You can depend on A which in turn depends on B version 1.2 and also depend on C which depends on D which depends on B version 1.1, and you only end up with one version of B included, version 1.2. Java execution environments also support all kinds of classloader isolation, so you can have multiple versions of the same jar and classes, all in the same JVM, only visible to the components that wanted to see them, so there's no clash.
But Bouncy Castle - and almost nothing else - adds another dimension across its artifact names. This is not standard! You now have to watch your dependency trees like a hawk to see that some other artifact doesn't bring in <artifactId>bcprov-jdk14</artifactId> to fuck with your <artifactId>bcprov-jdk18on</artifactId>, and if they do, you need to slap an <exclusion> on that dependency's dependency.
The reason Bouncy Castle does this is that it chooses to support some very old versions of Java that predate JDK 9's introduction of multi-release jars (https://docs.oracle.com/en/java/javase/21/docs/specs/jar/jar...), which removed the need for differently named jars for different JDKs (...but only from JDK 9 onwards).
So, in general, the Java tools have this solved, unless you're Bouncy Castle.
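For anyone who has not had to do this, the `<exclusion>` mentioned above might look roughly like the following in a Maven `pom.xml`. The outer dependency (`com.example:some-client:1.0.0`) is a made-up placeholder for whichever library drags in the old provider; only the `org.bouncycastle` group and the `bcprov-jdk14` artifact name come from the discussion.

```xml
<dependency>
  <!-- Hypothetical library that transitively depends on bcprov-jdk14 -->
  <groupId>com.example</groupId>
  <artifactId>some-client</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <exclusion>
      <!-- Keep the old provider variant off the classpath -->
      <groupId>org.bouncycastle</groupId>
      <artifactId>bcprov-jdk14</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

You would then declare `bcprov-jdk18on` yourself as a direct dependency, so exactly one provider variant ends up in the image.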
great article
I'm feeling like an old man now, but who the hell calls a tool "buildah"? Especially with its ugly dog logo. You can almost assume the dog wants to say "builder" but the extra flaps of skin make the sound distorted.
At least it is search engine friendly. Recently had to search for code snippets for the 30 year old "expect" tool. Was rather difficult and I thought, well the Web is younger than that tool, they could not imagine a search engine. Hint: "expect script" seemed to work decently well.
(.. or a search on https://pkgs.org to surface metadata)
Looks like it's a silly and self-aware play on the word "builder" (New England regional dialect):
> Since I’m relatively new to the world of containers and images, I was excited to learn about the Buildah tool. Especially since I’m a native New Englander and it’s a clever play on how we say Builder in these parts. [0]
[0] https://buildah.io/blogs/2017/06/22/introducing-buildah.html
That is correct. The person who largely had overall responsibility for Red Hat’s open source container tooling is a Boston area native.
Much like the choice to stop using language features like capitalization, it’s part of the current cultural trend.
Kinda like Buildly or Buildr. It’s cool until it’s your turn to be old. Then you look back and wince.
I've been using no capitalization on short messages in chat for more than 20 years (and still do), but an entire article written in the style makes it harder to read. It's funny that the author believes in syntax highlighting for code readability but not capitalization for English readability.
> but an entire article written in the style makes it harder to read
That's purely a familiarity effect; it's a self-solving problem.
Perhaps I'd become effortlessly fluent in Aramaic if I had to read enough articles in it, but absent some substantial benefit I'd prefer to keep with standard English.
You would, but it would take a while. Reading slightly different letter forms is more of a matter of hours.
> it’s part of the current cultural trend
Is it, or is it just a niche, like people who write 5-digit years, putting a 0 in front?
It's still very rare to encounter any of those.
Is it a current trend? my Mom does this and she picked it up in the 70's on typewriters.
Those people are so short sighted. I put two 0's in front because I really care about humanity. This, I believe, will help fix climate change. Excuse me while I sniff my own farts.
I mean "buildah" is at least searchable (imagine trying to look for a build tool called "builder"). The lack of capitalization doesn't have any positive side-effects, apart from saving your shift key some use..
> who the hell calls a tool "buildah"?
Bostonians? :P
Since it's a name I'm fine with it. That is actually some people's pronunciation, even if no one's spelling, but I have no problem taking them seriously since they are not simply putting annoying affectation into writing; it's a name. Names have to be distinct, and they don't have to be cute, but being cute is also not exactly damning either.
All that said, probably wouldn't have been my choice either.
It's weird. I personally wouldn't want quite such a silly name for that particular kind of tool, but that is a funny thing for me to say because I was never one of the people who wanted to remove the swear words from the kernel because "professional impression". Don't ask me to explain it.
> hell calls a tool "buildah"?
people who seem to have done a pretty good job
I mean, a branding logo for this kind of tool really doesn't matter, and if so, why should you hire a graphic designer to do that for you if you already have something which is passable?
You can read it as build-ah; "ah" is in some languages the word for the sound people make when they have an insight/light bulb moment. It might also just be a coincidence, idk.
But most importantly it's a nicely searchable word, it's memorable too, it's pronounceable, and it's somewhat related to what it does (a "build" tool).
So in all the metrics which matter it's a good name.
I like the dog logo. Thanks for calling attention to it, I now have something to ghiblify.
ext4 has no data checksums or data integrity checks, so silent corruption of file contents can go unnoticed and you wouldn't even know about it. switch to btrfs, it's way better
fun read, now i want to learn about overlays