I’ve been struggling with something that sounds simple but is surprisingly annoying:
capturing content quickly across devices in a self-hosted environment.
On Android there’s the share menu, on iOS there’s Shortcuts, on desktop there’s copy/paste… but everything feels fragmented.
I often end up losing things or postponing them just because capturing isn’t frictionless.
Curious how others handle this.


On my phone, my Screenshot folder is syncthing’d to my desktop, so most of the time, capturing something in the moment is as simple as dragging three fingers down my screen. My Camera and default Download folders are also syncthing’d, so just taking a picture or saving something from a browser has it captured across my devices.
I also use Tududi, which has Telegram integration, for quick notes. Taking a note is just a matter of sending a message in Telegram, which is available on all my devices. Signal’s “Note To Self” feature is also useful; I trust it more than Telegram for sensitive data. In Firefox on my desktop, I have “Automatic Tab Opener” (a browser extension) pulling up my Tududi inbox every hour, reminding me to actually deal with the notes I’ve taken.
That’s actually a really solid setup.
What always got me personally is exactly that — over time I’d end up with multiple “entry points” depending on context (screenshot, chat, browser, notes…).
Each one works, but I’d still need to mentally switch between them depending on what I’m capturing.
I kept wishing for something where the entry point is always the same, no matter the context.
So long as you’re manually processing everything, screenshots work for all of that. You can take a note in any text box anywhere, and screenshot it. Chat message? Screenshot. Browser? Screenshot. Notes? Screenshot. You can even take a photo and then screenshot it to capture it into your workflow.
I have Shutter (apt install shutter) on my desktop, and I’ve rebound the Print Screen key to run “shutter -s”. This lets me capture an area of my screen with one button (and a mouse drag). Bam, more screenshot.
The downside of screenshots is obvious, of course: extracting the text from a screenshot is a bit of a pain in the ass. If you really want to keep the same entry point, though, you could set up a script to OCR newly captured screenshots/photos and extract the text. An OCR-friendly font might make that pretty reliable.
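For what it’s worth, here is a rough sketch of what that OCR script could look like. It is only an illustration, not something I’ve run against a real setup: it assumes the tesseract command-line tool is installed, and the function names and the “.txt next to each image” convention are made up for the example.

```python
import subprocess
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def tesseract_ocr(image: Path) -> str:
    """Run the tesseract CLI on one image and return the recognized text."""
    # Assumption: tesseract is on PATH. "stdout" as the output base makes
    # it print the text instead of writing its own file.
    result = subprocess.run(
        ["tesseract", str(image), "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def ocr_new_images(folder: Path, ocr=tesseract_ocr) -> list[Path]:
    """OCR every image in folder that has no .txt sidecar yet."""
    written = []
    for image in sorted(folder.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Hypothetical convention: shot.png gets a shot.png.txt next to it.
        sidecar = image.with_name(image.name + ".txt")
        if sidecar.exists():
            continue  # already processed on an earlier run
        sidecar.write_text(ocr(image), encoding="utf-8")
        written.append(sidecar)
    return written
```

Calling ocr_new_images() on the synced folder from cron or a systemd timer would keep the capture step itself unchanged; the text files just show up next to the images later.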
Now I want to improve my setup…
That’s actually a really interesting direction — using screenshots as a universal entry point.
It kind of shows how strong the need is for a single capture flow, even if it means bending everything into one format.
What always stopped me there is exactly what you mentioned — once you need OCR, scripts, post-processing… it starts adding friction again.
I kept wondering if the entry point could stay just as universal, but without needing to transform the content first.
The screenshot folder itself is certainly not limited to just screenshots. Any file you can save can be kept in there. To my mind, the “entry point” is “saving a file to this particular folder”, regardless of the specific method used to do the saving. The screenshot is just an extremely convenient way to do that.
I just thought of a way to improve this technique with Tasker. Tasker can work with the clipboard, edit files, and take a screenshot. So, you could set up a gesture to trigger a task in Tasker. Tasker can then take the screenshot, dumping it into the folder. Tasker can then check the clipboard; if there is text in your clipboard, it can prepend it to a single “TODO.txt” in your screenshot folder.
Linux could be configured much the same way, using shutter and xclip to capture the screenshot and clipboard, respectively.
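A minimal sketch of the clipboard half on Linux, assuming xclip is installed and an X11 session. The function names and the TODO.txt location are placeholders, and the screenshot itself would still come from the existing “shutter -s” binding.

```python
import subprocess
from pathlib import Path


def read_clipboard() -> str:
    """Return the X11 clipboard text via xclip, or "" if nothing is there."""
    result = subprocess.run(
        ["xclip", "-selection", "clipboard", "-o"],
        capture_output=True, text=True,
    )
    # Treat any xclip failure as an empty clipboard.
    return result.stdout if result.returncode == 0 else ""


def prepend_note(todo: Path, text: str) -> bool:
    """Put text at the top of the todo file; ignore blank clipboards."""
    text = text.strip()
    if not text:
        return False
    existing = todo.read_text(encoding="utf-8") if todo.exists() else ""
    todo.write_text(text + "\n" + existing, encoding="utf-8")
    return True
```

Bound to the same key as the screenshot, something like prepend_note(folder / "TODO.txt", read_clipboard()) would put the newest clipboard text on the first line of the file, which Syncthing then carries to the other devices.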
Yeah that makes sense — treating a folder as the universal entry point is a clever way to unify things.
I think that’s exactly the direction: trying to reduce everything to a single “drop zone”.
Where I personally kept feeling friction is that you still need something in between to get things into that folder (scripts, gestures, automations, etc.).
So the entry point becomes “save to this folder”, but the way you get there still depends on context.
That’s the part I always found hard to make truly uniform.
Let’s try this a different way…
How do you want to indicate something should be retained? What is the single, physical act you want to perform to tell the operating system “this thing needs to be captured”?
That’s a really good way to frame it.
I kept coming back to the idea that the “act” shouldn’t be something new you have to learn — it should reuse what you’re already doing in each context.
So instead of one single physical gesture, it’s more like a single intent expressed through different native actions:
The key (for me) wasn’t forcing one gesture, but making all of those feel like the same action underneath.
So the mental model becomes: “this goes into my inbox”, regardless of how I triggered it.
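As a toy illustration of that mental model (purely hypothetical names, not any particular tool): every trigger, whether it hands over text or a file, ends up calling the same function, and that function only knows about one inbox folder.

```python
import shutil
import time
from pathlib import Path


def to_inbox(item, inbox: Path) -> Path:
    """Store a text snippet or an existing file in the single inbox folder."""
    inbox.mkdir(parents=True, exist_ok=True)
    stamp = time.time_ns()  # timestamp prefix keeps arrival order visible
    if isinstance(item, Path):
        # A file (screenshot, download, photo): copy it in under a new name.
        target = inbox / f"{stamp}-{item.name}"
        shutil.copy2(item, target)
    else:
        # Anything else is treated as a text note.
        target = inbox / f"{stamp}-note.txt"
        target.write_text(str(item), encoding="utf-8")
    return target
```

A share-menu handler, a hotkey script, and a chat bot could each call to_inbox() with whatever they received, so the triggers differ but the destination and the format on disk stay the same.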
That’s where things started to click for me.