Danbooru

Bookmarklet possibly not getting original/best image from Tumblr

Posted under Bugs & Features

I think this might be related to issue #4401, but I'm not sure.

I've been trying to resolve the bad link and bad source on post #2785496. The current source is http://data.tumblr.com/92ae66364f43cf5e5e90879b100aaad4/tumblr_o2xl5iWmDn1v97ssmo1_raw.jpg; following this URL gives me a strange security error in Firefox, saying that data.tumblr.com is trying to use a security certificate meant for Amazon. By browsing the artist's tumblr page manually, I was able to locate the post, with an image URL of https://66.media.tumblr.com/92ae66364f43cf5e5e90879b100aaad4/tumblr_o2xl5iWmDn1v97ssmo1_1280.jpg. Note that the two URLs match in the 92ae66364f43cf5e5e90879b100aaad4/tumblr_o2xl5iWmDn1v97ssmo1 portion.
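For anyone who wants to check this kind of match without eyeballing it, here is a rough Python sketch. The path pattern is only inferred from the two URLs in this post (a hex directory, then tumblr_<id>_<size>.<ext>), and the function names are made up for illustration; Tumblr may well use other layouts.

```python
import re
from urllib.parse import urlparse

# Assumed layout, based only on the two URLs quoted in this thread:
#   /<hex hash>/tumblr_<id>_<size>.<ext>   where <size> is "raw" or a number
_TUMBLR_PATH = re.compile(
    r"^/(?P<hash>[0-9a-f]+)/(?P<stem>tumblr_[A-Za-z0-9]+)_(?P<size>raw|\d+)\.(?P<ext>\w+)$"
)

def tumblr_image_key(url):
    """Return (hash, stem, size) for a Tumblr image URL, or None if it does not fit the assumed layout."""
    match = _TUMBLR_PATH.match(urlparse(url).path)
    if match is None:
        return None
    return (match.group("hash"), match.group("stem"), match.group("size"))

def same_tumblr_image(url_a, url_b):
    """True if both URLs share the same hash and filename stem, ignoring host and size suffix."""
    key_a, key_b = tumblr_image_key(url_a), tumblr_image_key(url_b)
    return key_a is not None and key_b is not None and key_a[:2] == key_b[:2]

raw_url = "http://data.tumblr.com/92ae66364f43cf5e5e90879b100aaad4/tumblr_o2xl5iWmDn1v97ssmo1_raw.jpg"
sized_url = "https://66.media.tumblr.com/92ae66364f43cf5e5e90879b100aaad4/tumblr_o2xl5iWmDn1v97ssmo1_1280.jpg"
print(same_tumblr_image(raw_url, sized_url))
```

A match here only says the two URLs name the same upload. It says nothing about whether the bytes behind them are identical.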

Despite one having the "raw" suffix and the other having the "1280" suffix, both have the same resolution of 1153x800. I downloaded both to compare them, and the one from tumblr is less than half the file size of the Danbooru copy (214 KB vs 525 KB). A closer inspection revealed that the tumblr image has been subjected to an additional JPEG compression pass, which has also made it progressive.
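The progressive part can be checked without an image editor. Below is a minimal sketch that walks the JPEG marker segments and looks at the first start-of-frame marker: SOF0/SOF1 mean sequential (baseline or extended), SOF2 means progressive. It handles only these common Huffman-coded cases and returns None for anything else; the function name is just for illustration.

```python
import struct

def jpeg_is_progressive(data):
    """Return True for progressive (SOF2), False for sequential (SOF0/SOF1), None if undetermined."""
    if data[:2] != b"\xff\xd8":          # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG file")
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            raise ValueError("expected a marker at offset %d" % i)
        marker = data[i + 1]
        if marker == 0xFF:               # fill byte before a marker, skip it
            i += 1
            continue
        if marker in (0xC0, 0xC1):       # SOF0 / SOF1: sequential DCT
            return False
        if marker == 0xC2:               # SOF2: progressive DCT
            return True
        if marker in (0xD9, 0xDA):       # EOI or start of scan: no frame header found
            break
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            i += 2                       # standalone markers carry no length field
            continue
        (length,) = struct.unpack(">H", data[i + 2:i + 4])
        i += 2 + length                  # length counts its own two bytes
    return None
```

To use it on a downloaded file, read the file in binary mode and pass the bytes, e.g. `jpeg_is_progressive(open(path, "rb").read())`, where `path` is wherever you saved the image.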

The strange thing is that if I use the bookmarklet on the tumblr post, the subsequent upload page indicates the image file size is 214 KB. It's possible that the version Danbooru already has is no longer available, which does happen. But it also seems possible that the bookmarklet is getting a resampled image with the same resolution as the original, instead of the original.

_raw posts are not accessible anymore, so all we can access is the biggest version they make available. And yes, tumblr compresses pictures hard.
Even if the original file is smaller than the biggest size allowed by tumblr, they'll still compress to save bandwidth, which is what happens with that post. The _1280 version is a compressed jpg of the _raw. Nothing we can do about it.

feline_lump said:

See topic #14154 for older discussion on _raw. There's still a (time-consuming) exploit to download them, but I'm not sure if there's any way that could be made compatible with the bookmarklet.

You'd basically have to create and export a new account for every post that you want to extract the raw version from, if you wanted to implement something like that on Danbooru. So they're practically not accessible by us.

Tried out exporting 100+ files, only took a few hours. I'll admit they weren't the largest images in the world.

It is completely impractical if you're racing for first post/upload cred. It does seem doable for rooting around in old posts, but I'm not sure anyone's really been bothering and I can see why. It's nice 'cause the original file is still sitting back there somewhere, but it's definitely some very slow hoops to jump through.

Sourcing's a little weird. I still got lots to learn about the booru.

nonamethanks said:

You'd basically have to create and export a new account for every post that you want to extract the raw version from, if you wanted to implement something like that on Danbooru. So they're practically not accessible by us.

One account can make multiple blogs and export them separately, if that speeds things up enough. I gotta admit it still seems like a pain in the neck.
