hydrus/help/downloader_intro.html

<html>
	<head>
		<title>downloader - intro</title>
		<link href="hydrus.ico" rel="shortcut icon" />
		<link href="style.css" rel="stylesheet" type="text/css" />
	</head>
	<body>
		<div class="content">
			<p class="warning">This system and its help are both under construction! Even when it is done, it will be for advanced users who understand HTML or JSON. Beware!</p>
			<h3>this system</h3>
			<p>The first versions of hydrus's downloaders were all hardcoded and static--I wrote everything into the program itself and nothing was user-creatable or -fixable. After the maintenance burden of the entire messy system proved too large for me to keep up with and a semi-editable booru system proved successful, I decided to significantly overhaul the whole thing. The new system allows user creation and sharing of every component. It is designed to be very simple for the front-end user--they will typically handle a couple of png files and then select a new downloader from a list--but very flexible (and hence pretty complicated) on the back-end. These help pages describe the different components with the intention of making an HTML- or JSON-fluent user able to create and share a full new downloader on their own.</p>
			<p>As always, this is all under active development. Your feedback on the system would be appreciated, and if something is confusing or you discover something in here that is out of date, please <a href="contact.html">let me know</a>.</p>
			<h3>what is a downloader?</h3>
			<p>In hydrus, a downloader is one of:</p>
			<ul>
				<li><h3>Gallery Downloader</h3></li>
				<li>This takes a string like 'blue_eyes' and produces a series of thumbnail gallery page URLs. These are parsed for image page URLs, which are in turn parsed for file URLs and metadata like tags. Boorus fall into this category.</li>
				<li><h3>Thread Watcher</h3></li>
				<li>This takes a URL that it will check repeatedly, parsing it for new URLs that it then queues up to be downloaded. It typically stops checking after the 'file velocity' (such as '1 new file per day') drops below a certain level.</li>
				<li><h3>Single Page Downloader</h3></li>
				<li>This takes a URL once and parses it for more URLs. This is a miscellaneous system for certain simple gallery types. The 'page of images' downloader is one of these.</li>
			</ul>
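			<p>As a rough illustration of the thread watcher's 'file velocity' rule, here is a minimal sketch. The function name, the one-day window, and the threshold of one file are all assumptions made up for this example, not how hydrus actually implements it.</p>

```python
def should_keep_checking(new_file_times, now, window_seconds=86400, min_files_per_window=1):
    """Return True while enough new files arrived inside the recent window.

    new_file_times: timestamps (in seconds) at which new files were found.
    All names and defaults here are hypothetical, for illustration only.
    """
    recent = [t for t in new_file_times if now - window_seconds < t <= now]
    return len(recent) >= min_files_per_window


# A thread that produced a file within the last day is still worth checking.
print(should_keep_checking([100, 90000], now=100000))
# A thread whose only file is older than the window has dropped below the level.
print(should_keep_checking([100], now=100000))
```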
			<p>The system currently supports HTML and JSON parsing.</p>
			<h3>what does a downloader do?</h3>
			<p>As an example, in order for hydrus to convert our 'blue_eyes' query into a bunch of files with tags, it needs to:</p>
			<ul>
				<li>Present some user interface named 'Safebooru Downloader' to the user that will convert their input of 'blue_eyes' into <a href="https://safebooru.org/index.php?page=post&s=list&tags=blue_eyes&pid=0">https://safebooru.org/index.php?page=post&s=list&tags=blue_eyes&pid=0</a>.</li>
				<li>Recognise <a href="https://safebooru.org/index.php?page=post&s=list&tags=blue_eyes&pid=0">https://safebooru.org/index.php?page=post&s=list&tags=blue_eyes&pid=0</a> as a Safebooru Gallery URL.</li>
				<li>Convert the HTML of a Safebooru Gallery URL into a list of URLs like <a href="https://safebooru.org/index.php?page=post&s=view&id=2437965">https://safebooru.org/index.php?page=post&s=view&id=2437965</a> and possibly a 'next page' URL that points to another page of thumbnails.</li>
				<li>Recognise <a href="https://safebooru.org/index.php?page=post&s=view&id=2437965">https://safebooru.org/index.php?page=post&s=view&id=2437965</a> as a Safebooru Post URL.</li>
				<li>Convert the HTML of a Safebooru Post URL into a file URL like <a href="https://safebooru.org//images/2329/b6e8c263d691d1c39a2eeba5e00709849d8f864d.jpg">https://safebooru.org//images/2329/b6e8c263d691d1c39a2eeba5e00709849d8f864d.jpg</a> and some tags like: 1girl, bangs, black gloves, blonde hair, blue eyes, braid, closed mouth, day, fingerless gloves, fingernails, gloves, grass, hair ornament, hairclip, hands clasped, creator:hankuri, interlocked fingers, long hair, long sleeves, outdoors, own hands together, parted bangs, pointy ears, character:princess zelda, smile, solo, series:the legend of zelda, underbust.</li>
			</ul>
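			<p>The URL-building and URL-recognising steps above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names are hypothetical and the real system is configured through the client's UI rather than written as code.</p>

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SAFEBOORU_INDEX = "https://safebooru.org/index.php"


def build_gallery_url(query, pid=0):
    """Turn a user query like 'blue_eyes' into a gallery URL (hypothetical helper)."""
    params = {"page": "post", "s": "list", "tags": query, "pid": pid}
    return SAFEBOORU_INDEX + "?" + urlencode(params)


def classify_url(url):
    """Label a URL as 'gallery', 'post', or 'unknown' from its query parameters."""
    parts = urlsplit(url)
    if parts.netloc != "safebooru.org" or parts.path != "/index.php":
        return "unknown"
    params = parse_qs(parts.query)
    if params.get("page") != ["post"]:
        return "unknown"
    if params.get("s") == ["list"] and "tags" in params:
        return "gallery"
    if params.get("s") == ["view"] and "id" in params:
        return "post"
    return "unknown"


gallery_url = build_gallery_url("blue_eyes")
print(gallery_url)
print(classify_url(gallery_url))
print(classify_url("https://safebooru.org/index.php?page=post&s=view&id=2437965"))
```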
			<p>So we have three components:</p>
			<ul>
				<li><b>Search:</b> faces the user and converts text input into a series of Gallery URLs.</li>
				<li><b>URL Class:</b> identifies URLs and informs the client how to deal with them.</li>
				<li><b>Parser:</b> converts data from URLs into hydrus-understandable metadata.</li>
			</ul>
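			<p>To give a feel for what the Parser component does, here is a small sketch that pulls a file URL and some tags out of a post page. The sample markup, the <code>id</code> and <code>class</code> values, and the class and function names are all invented for this example; a real site's HTML will look different, and hydrus's own parsers are built in the client rather than written like this.</p>

```python
from html.parser import HTMLParser


class PostPageParser(HTMLParser):
    """Collects one file URL and a list of tags from a made-up page layout."""

    def __init__(self):
        super().__init__()
        self.file_url = None
        self.tags = []
        self._in_tag_item = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("id") == "image":
            self.file_url = attrs.get("src")
        elif tag == "li" and attrs.get("class") == "tag":
            self._in_tag_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_tag_item = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_tag_item and text:
            self.tags.append(text)


# Invented sample markup, not copied from any real site.
SAMPLE_HTML = """
<img id="image" src="https://example.com/images/1/abc.jpg">
<ul>
  <li class="tag">blue eyes</li>
  <li class="tag">character:princess zelda</li>
</ul>
"""


def parse_post_page(html):
    """Return (file_url, tags) for one post page."""
    parser = PostPageParser()
    parser.feed(html)
    parser.close()
    return parser.file_url, parser.tags


file_url, tags = parse_post_page(SAMPLE_HTML)
print(file_url)
print(tags)
```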
			<p>Thread watchers and single page downloaders do not need the 'Search' component, as the input in this case <i>is</i> a URL. You drop an imageboard thread URL on the client and it automatically recognises what it is, launches a thread watcher page for it, and finds the correct parser for the output.</p>
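			<p>The 'drop a URL and the client works out what to do' behaviour can be pictured as a lookup from URL pattern to page type and parser. The sketch below is an assumption-laden illustration: the example.com URLs, the regular expressions, and every name in it are hypothetical.</p>

```python
import re
from typing import NamedTuple, Optional


class URLClass(NamedTuple):
    name: str
    pattern: re.Pattern
    page_type: str
    parser_name: str


# Hypothetical URL classes for made-up sites.
URL_CLASSES = [
    URLClass(
        "example imageboard thread",
        re.compile(r"^https://boards\.example\.com/\w+/thread/\d+$"),
        "thread watcher",
        "example thread parser",
    ),
    URLClass(
        "example simple gallery",
        re.compile(r"^https://gallery\.example\.com/album/\d+$"),
        "single page downloader",
        "example album parser",
    ),
]


def match_url_class(url: str) -> Optional[URLClass]:
    """Return the first URL class whose pattern matches, or None."""
    for url_class in URL_CLASSES:
        if url_class.pattern.match(url):
            return url_class
    return None


dropped = match_url_class("https://boards.example.com/g/thread/12345")
if dropped is not None:
    print(dropped.page_type, "using", dropped.parser_name)
```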
			<p class="right"><a href="downloader_url_classes.html">Let's learn about URL Classes ----></a></p>
		</div>
	</body>
</html>