1 \input texinfo
2 @setfilename ../../info/url
3 @settitle URL Programmer's Manual
4
5 @iftex
6 @c @finalout
7 @end iftex
8 @c @setchapternewpage odd
9 @c @smallbook
10
11 @tex
12 \overfullrule=0pt
13 %\global\baselineskip 30pt % for printing in double space
14 @end tex
15 @dircategory World Wide Web
16 @dircategory GNU Emacs Lisp
17 @direntry
18 * URL: (url). URL loading package.
19 @end direntry
20
21 @copying
22 This file documents the URL loading package.
23
24 Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002,
25 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
26
27 @quotation
28 Permission is granted to copy, distribute and/or modify this document
29 under the terms of the GNU Free Documentation License, Version 1.3 or
30 any later version published by the Free Software Foundation; with no
31 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
32 and with the Back-Cover Texts as in (a) below. A copy of the license
33 is included in the section entitled ``GNU Free Documentation License''.
34
35 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
36 modify this GNU manual. Buying copies from the FSF supports it in
37 developing GNU and promoting software freedom.''
38 @end quotation
39 @end copying
40
41 @c
42 @titlepage
43 @title URL Programmer's Manual
44 @subtitle First Edition, URL Version 2.0
45 @author William M. Perry @email{wmperry@@gnu.org}
46 @author David Love @email{fx@@gnu.org}
47 @page
48 @vskip 0pt plus 1filll
49 @insertcopying
50 @end titlepage
51
52 @page
53 @node Top
54 @top URL
55
56
57 @menu
58 * Getting Started:: Preparing your program to use URLs.
59 * Retrieving URLs:: How to use this package to retrieve a URL.
60 * Supported URL Types:: Descriptions of URL types currently supported.
61 * Defining New URLs:: How to define a URL loader for a new protocol.
62 * General Facilities:: URLs can be cached, accessed via a gateway
63 and tracked in a history list.
64 * Customization:: Variables you can alter.
65 * GNU Free Documentation License:: The license for this documentation.
66 * Function Index::
67 * Variable Index::
68 * Concept Index::
69 @end menu
70
71 @node Getting Started
72 @chapter Getting Started
73 @cindex URLs, definition
74 @cindex URIs
75
76 @dfn{Uniform Resource Locators} (URLs) are a specific form of
77 @dfn{Uniform Resource Identifiers} (URIs), described in RFC 2396, which
78 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
79 agents.
80
81 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
82 @var{scheme}s supported by this library are described below.
83 @xref{Supported URL Types}.
84
85 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
86 IRC and gopher URLs all have the form
87
88 @example
89 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
90 @end example
91 @noindent
92 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
93 @var{userinfo} sometimes takes the form @var{username}:@var{password}
94 but you should beware of the security risks of sending cleartext
95 passwords. @var{hostname} may be a domain name or a dotted decimal
96 address. If the @samp{:@var{port}} is omitted then the library will
97 use the `well known' port for that service when accessing URLs. With
98 the possible exception of @code{telnet}, it is rare for ports to be
99 specified, and using a non-standard port may have
100 undesired consequences if a different service is listening on that
101 port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
102 sent). @c , but @xref{Other Variables, url-bad-port-list}.
103 The meaning of the @var{path} component depends on the service.
104
105 @menu
106 * Configuration::
107 * Parsed URLs:: URLs are parsed into vector structures.
108 @end menu
109
110 @node Configuration
111 @section Configuration
112
113 @defvar url-configuration-directory
114 @cindex @file{~/.url}
115 @cindex configuration files
116 The directory in which URL configuration files, the cache, etc.,
117 reside. The default is @file{~/.url}.
118 @end defvar
119
120 @node Parsed URLs
121 @section Parsed URLs
122 @cindex parsed URLs
123 The library functions typically operate on @dfn{parsed} versions of
124 URLs. These are actually vectors of the form:
125
126 @example
127 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
128 @end example
129
130 @noindent where
131 @table @var
132 @item type
133 is the type of the URL scheme, e.g., @code{http};
134 @item user
135 is the username associated with it, or @code{nil};
136 @item password
137 is the user password associated with it, or @code{nil};
138 @item host
139 is the host name associated with it, or @code{nil};
140 @item port
141 is the port number associated with it, or @code{nil};
142 @item file
143 is the `file' part of it, or @code{nil}. This doesn't necessarily
144 actually refer to a file;
145 @item target
146 is the target part, or @code{nil};
147 @item attributes
148 is the attributes associated with it, or @code{nil};
149 @item full
150 is @code{t} for a fully-specified URL, with a host part indicated by
151 @samp{//} after the scheme part.
152 @end table
153
154 @findex url-type
155 @findex url-user
156 @findex url-password
157 @findex url-host
158 @findex url-port
159 @findex url-file
160 @findex url-target
161 @findex url-attributes
162 @findex url-full
163 @findex url-set-type
164 @findex url-set-user
165 @findex url-set-password
166 @findex url-set-host
167 @findex url-set-port
168 @findex url-set-file
169 @findex url-set-target
170 @findex url-set-attributes
171 @findex url-set-full
172 These attributes have accessors named @code{url-@var{part}}, where
173 @var{part} is the name of one of the elements above, e.g.,
174 @code{url-host}. Similarly, there are setters of the form
175 @code{url-set-@var{part}}.
176
177 There are functions for parsing and unparsing between the string and
178 vector forms.
179
180 @defun url-generic-parse-url url
181 Return a parsed version of the string @var{url}.
182 @end defun
183
184 @defun url-recreate-url url
185 @cindex unparsing URLs
186 Recreates a URL string from the parsed @var{url}.
187 @end defun
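
As an illustration of the accessors and of the parsing and unparsing
functions described above, the following sketch splits a URL string
into its parts and rebuilds it. The function name
@code{my-url-summary} and the example URL are hypothetical; only
@code{url-generic-parse-url}, @code{url-recreate-url} and the
@code{url-@var{part}} accessors come from the library.

@example
(require 'url-parse)

(defun my-url-summary (string)
  "Return the scheme, host and port of the URL STRING, plus the URL rebuilt."
  (let ((parsed (url-generic-parse-url string)))
    (list (url-type parsed)             ; scheme, as a string
          (url-host parsed)             ; host name
          (url-port parsed)             ; port, as a number
          (url-recreate-url parsed))))  ; back to string form

(my-url-summary "http://www.example.com:8080/foo/bar")
@result{} ("http" "www.example.com" 8080
    "http://www.example.com:8080/foo/bar")
@end example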
188
189 @node Retrieving URLs
190 @chapter Retrieving URLs
191
192 @defun url-retrieve-synchronously url
193 Retrieve @var{url} synchronously and return a buffer containing the
194 data. @var{url} is either a string or a parsed URL structure. Return
195 @code{nil} if there are no data associated with it (the case for dired,
196 info, or mailto URLs that need no further processing).
197 @end defun
198
199 @defun url-retrieve url callback &optional cbargs
200 Retrieve @var{url} asynchronously and call @var{callback} with args
201 @var{cbargs} when finished. The callback is called when the object
202 has been completely retrieved, with the current buffer containing the
203 object and any MIME headers associated with it. @var{url} is either a
204 string or a parsed URL structure. Returns the buffer @var{url} will
205 load into, or @code{nil} if the process has already completed.
206 @end defun
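
The two retrieval functions might be used along the following lines.
This is only a sketch: the names beginning with @code{my-} are
hypothetical, and the callback's first parameter assumes a version of
the library that passes a status list to the callback ahead of
@var{cbargs}. Check the documentation string of @code{url-retrieve}
in your Emacs before relying on that.

@example
(require 'url)

(defun my-url-fetch-string (url)
  "Retrieve URL synchronously; return headers and body as a string.
Return nil if there are no data associated with URL."
  (let ((buffer (url-retrieve-synchronously url)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer
            (buffer-string))
        ;; The retrieval buffer is ours to dispose of.
        (kill-buffer buffer)))))

(defun my-url-report-size (_status label)
  "Callback: report how much text was retrieved for LABEL.
The current buffer holds the headers followed by the body."
  (message "%s: %d characters retrieved" label (buffer-size)))

(defun my-url-fetch-async (url)
  "Start retrieving URL asynchronously and return at once."
  (url-retrieve url #'my-url-report-size (list url)))
@end example

Neither function touches the network until it is called, e.g., with
@code{(my-url-fetch-async "http://www.example.com/")}.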
207
208 @node Supported URL Types
209 @chapter Supported URL Types
210
211 @menu
212 * http/https:: Hypertext Transfer Protocol.
213 * file/ftp:: Local files and FTP archives.
214 * info:: Emacs `Info' pages.
215 * mailto:: Sending email.
216 * news/nntp/snews:: Usenet news.
217 * rlogin/telnet/tn3270:: Remote host connectivity.
218 * irc:: Internet Relay Chat.
219 * data:: Embedded data URLs.
220 * nfs:: Networked File System
221 @c * finger::
222 @c * gopher::
223 @c * netrek::
224 @c * prospero::
225 * cid:: Content-ID.
226 * about::
227 * ldap:: Lightweight Directory Access Protocol
228 * imap:: IMAP mailboxes.
229 * man:: Unix man pages.
230 @end menu
231
232 @node http/https
233 @section @code{http} and @code{https}
234
235 The scheme @code{http} is Hypertext Transfer Protocol. The library
236 supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
237 defined in RFC 1945.) HTTP URLs have the following form, where most of
238 the parts are optional:
239 @example
240 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
241 @end example
242 @c The @code{:@var{port}} part is optional, and @var{port} defaults to
243 @c 80. The @code{/@var{path}} part, if present, is a slash-separated
244 @c series elements. The @code{?@var{searchpart}}, if present, is the
245 @c query for a search or the content of a form submission. The
246 @c @code{#fragment} part, if present, is a location in the document.
247
248 The scheme @code{https} is a secure version of @code{http}, with
249 transmission via SSL. It is defined in RFC 2818. Its default port is
250 443. This scheme depends on SSL support in Emacs via the
251 @file{ssl.el} library and is actually implemented by forcing the
252 @code{ssl} gateway method to be used. @xref{Gateways in general}.
253
254 @defopt url-honor-refresh-requests
255 This controls honoring of HTTP @samp{Refresh} headers by which
256 servers can direct clients to reload documents from the same URL or a
257 different one. @code{nil} means they will not be honored,
258 @code{t} (the default) means they will always be honored, and
259 otherwise the user will be asked on each request.
260 @end defopt
261
262
263 @menu
264 * Cookies::
265 * HTTP language/coding::
266 * HTTP URL Options::
267 * Dealing with HTTP documents::
268 @end menu
269
270 @node Cookies
271 @subsection Cookies
272
273 @defopt url-cookie-file
274 The file in which cookies are stored, defaulting to @file{cookies} in
275 the directory specified by @code{url-configuration-directory}.
276 @end defopt
277
278 @defopt url-cookie-confirmation
279 Specifies whether confirmation is required to accept cookies.
280 @end defopt
281
282 @defopt url-cookie-multiple-line
283 Specifies whether to put all cookies for the server on one line in the
284 HTTP request to satisfy broken servers like
285 @url{http://www.hotmail.com}.
286 @end defopt
287
288 @defopt url-cookie-trusted-urls
289 A list of regular expressions matching URLs from which to accept
290 cookies always.
291 @end defopt
292
293 @defopt url-cookie-untrusted-urls
294 A list of regular expressions matching URLs from which to reject
295 cookies always.
296 @end defopt
297
298 @defopt url-cookie-save-interval
299 The number of seconds between automatic saves of cookies to disk.
300 Default is one hour.
301 @end defopt
302
303
304 @node HTTP language/coding
305 @subsection Language and Encoding Preferences
306
307 HTTP allows clients to express preferences for the language and
308 encoding of documents which servers may honor. For each of these
309 variables, the value is a string; it can specify a single choice, or
310 it can be a comma-separated list.
311
312 Normally this list is ordered by descending preference. However, each
313 element can be followed by @samp{;q=@var{priority}} to specify its
314 preference level, a decimal number from 0 to 1; e.g., for
315 @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
316 en;q=0.7"}}. An element that has no @samp{;q} specification has
317 preference level 1.
318
319 @defopt url-mime-charset-string
320 @cindex character sets
321 @cindex coding systems
322 This variable specifies a preference for character sets when documents
323 can be served in more than one encoding.
324
325 HTTP allows specifying a series of MIME charsets which indicate your
326 preferred character set encodings, e.g., Latin-9 or Big5, and these
327 can be weighted. The default series is generated automatically from
328 the associated MIME types of all defined coding systems, sorted by the
329 coding system priority specified in Emacs. @xref{Recognize Coding, ,
330 Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
331 @end defopt
332
333 @defopt url-mime-language-string
334 @cindex language preferences
335 A string specifying the preferred language when servers can serve
336 files in several languages. Use RFC 1766 abbreviations, e.g.,
337 @samp{en} for English, @samp{de} for German.
338
339 The string can be @code{"*"} to get the first available language (as
340 opposed to the default).
341 @end defopt
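
For instance, a program that prefers German for a single request, but
accepts English, could bind the variable temporarily rather than
setting it globally. The function name @code{my-url-fetch-german} and
the preference values are hypothetical.

@example
(require 'url)

(defun my-url-fetch-german (url)
  "Retrieve URL, preferring German, then British English, then English."
  ;; The binding lasts only for the duration of this retrieval.
  (let ((url-mime-language-string "de, en-gb;q=0.8, en;q=0.7"))
    (url-retrieve-synchronously url)))
@end example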
342
343 @node HTTP URL Options
344 @subsection HTTP URL Options
345
346 HTTP supports an @samp{OPTIONS} method describing things supported by
347 the URL@.
348
349 @defun url-http-options url
350 Returns a property list describing options available for @var{url}. The
351 property list members are:
352
353 @table @code
354 @item methods
355 A list of symbols specifying what HTTP methods the resource
356 supports.
357
358 @item dav
359 @cindex DAV
360 A list of numbers specifying what DAV protocol/schema versions are
361 supported.
362
363 @item dasl
364 @cindex DASL
365 A list of supported DASL search types (string form).
366
367 @item ranges
368 A list of the units available for use in partial document fetches.
369
370 @item p3p
371 @cindex P3P
372 The @dfn{Platform for Privacy Preferences} description for the resource.
373 Currently this is just the raw header contents.
374 @end table
375
376 @end defun
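
The property list can be examined with @code{plist-get}. The
following sketch checks whether a given method is offered; the
@code{my-} function names are hypothetical, and it is an assumption
that the method symbols are spelled as the server sends them, e.g.,
@code{GET}.

@example
(require 'url-http)

(defun my-url-options-allow-p (options method)
  "Return non-nil if METHOD is among the methods listed in OPTIONS.
OPTIONS is a property list as returned by `url-http-options'."
  (memq method (plist-get options 'methods)))

(defun my-url-allows-p (url method)
  "Ask the server whether URL supports the HTTP method METHOD."
  (my-url-options-allow-p (url-http-options url) method))

;; Inspecting a hand-written property list, without network access:
(my-url-options-allow-p '(methods (GET HEAD OPTIONS)) 'HEAD)
@result{} (HEAD OPTIONS)
@end example

Keeping the property-list test separate from the network request
makes the former easy to try out on its own.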
377
378 @node Dealing with HTTP documents
379 @subsection Dealing with HTTP documents
380
381 HTTP URLs are retrieved into a buffer containing the HTTP headers
382 followed by the body. Since the headers are quasi-MIME, they may be
383 processed using the MIME library. @xref{Top,, Emacs MIME,
384 emacs-mime, The Emacs MIME Manual}. The URL package provides a
385 function to do this in general:
386
387 @defun url-decode-text-part handle &optional coding
388 This function decodes charset-encoded text in the current buffer. In
389 Emacs, the buffer is expected to be unibyte initially and is set to
390 multibyte after decoding.
391 @var{handle} is the MIME handle of the original part. @var{coding} is an explicit
392 coding to use, overriding what the MIME headers specify.
393 The coding system used for the decoding is returned.
394
395 Note that this function doesn't deal with @samp{http-equiv} charset
396 specifications in HTML @samp{<meta>} elements.
397 @end defun
398
399 @node file/ftp
400 @section file and ftp
401 @cindex files
402 @cindex FTP
403 @cindex File Transfer Protocol
404 @cindex compressed files
405 @cindex dired
406
407 @example
408 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
409 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
410 @end example
411
412 These schemes are defined in RFC 1738.
413 @samp{ftp:} and @samp{file:} are synonymous in this library. They
414 allow reading arbitrary files from hosts. Either @samp{ange-ftp}
415 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
416 hosts. Local files are accessed directly.
417
418 Compressed files are handled, but support is hard-coded so that
419 @code{jka-compr-compression-info-list} and so on have no effect.
420 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
421 @samp{.bz2}.
422
423 @defopt url-directory-index-file
424 The filename to look for when indexing a directory, default
425 @samp{"index.html"}. If this file exists, and is readable, then it
426 will be viewed instead of using @code{dired} to view the directory.
427 @end defopt
428
429 @node info
430 @section info
431 @cindex Info
432 @cindex Texinfo
433 @findex Info-goto-node
434
435 @example
436 info:@var{file}#@var{node}
437 @end example
438
439 Info URLs are not officially defined. They invoke
440 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
441 @samp{#@var{node}} is optional, defaulting to @samp{Top}.
442
443 @node mailto
444 @section mailto
445
446 @cindex mailto
447 @cindex email
448 A mailto URL will send an email message to the address in the
449 URL, for example @samp{mailto:foo@@bar.com} would compose a
450 message to @samp{foo@@bar.com}.
451
452 @defopt url-mail-command
453 @vindex mail-user-agent
454 The function called whenever url needs to send mail. This should
455 normally be left to default from @code{mail-user-agent}. @xref{Mail
456 Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
457 @end defopt
458
459 An @samp{X-Url-From} header field containing the URL of the document
460 that contained the mailto URL is added if that URL is known.
461
462 RFC 2368 extends the definition of mailto URLs in RFC 1738.
463 The form of a mailto URL is
464 @example
465 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
466 @end example
467 @noindent where an arbitrary number of @var{header}s can be added. If the
468 @var{header} is @samp{body}, then @var{contents} is put in the body;
469 otherwise a @var{header} header field is created with @var{contents}
470 as its contents. Note that the URL library does not consider any
471 headers `dangerous' so you should check them before sending the
472 message.
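
A mailto URL of this form can be assembled as sketched below. The
function name @code{my-mailto-url} is hypothetical;
@code{url-hexify-string}, from @file{url-util.el}, is used to
percent-encode characters such as spaces in the header contents.

@example
(require 'url-util)

(defun my-mailto-url (mailbox subject body)
  "Return a mailto URL for MAILBOX with a SUBJECT header and BODY text."
  (concat "mailto:" mailbox
          "?subject=" (url-hexify-string subject)
          "&body=" (url-hexify-string body)))

(my-mailto-url "foo@@bar.com" "Hello there" "Hi")
@result{} "mailto:foo@@bar.com?subject=Hello%20there&body=Hi"
@end example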
473
474 @c Fixme: update
475 Email messages are defined in @sc{rfc}822.
476
477 @node news/nntp/snews
478 @section @code{news}, @code{nntp} and @code{snews}
479 @cindex news
480 @cindex network news
481 @cindex usenet
482 @cindex NNTP
483 @cindex snews
484
485 @c draft-gilman-news-url-01
486 The network news URL schemes take the following forms, following RFC
487 1738, except that for compatibility with other clients, host and port
488 fields may be included in news URLs though they are properly only
489 allowed for nntp and snews.
490
491 @table @samp
492 @item news:@var{newsgroup}
493 Retrieves a list of messages in @var{newsgroup};
494 @item news:@var{message-id}
495 Retrieves the message with the given @var{message-id};
496 @item news:*
497 Retrieves a list of all available newsgroups;
498 @item nntp://@var{host}:@var{port}/@var{newsgroup}
499 @itemx nntp://@var{host}:@var{port}/@var{message-id}
500 @itemx nntp://@var{host}:@var{port}/*
501 Similar to the @samp{news} versions.
502 @end table
503
504 @samp{:@var{port}} is optional and defaults to :119.
505
506 @samp{snews} is the same as @samp{nntp} except that the default port
507 is :563.
508 @cindex SSL
509 (It is tunneled through SSL.)
510
511 An @samp{nntp} URL is the same as a news URL, except that the URL may
512 specify an article by its number.
513
514 @defopt url-news-server
515 This variable can be used to override the default news server.
516 Usually this will be set by the Gnus package, which is used to fetch
517 news.
518 @cindex environment variable
519 @vindex NNTPSERVER
520 It may be set from the conventional environment variable
521 @code{NNTPSERVER}.
522 @end defopt
523
524 @node rlogin/telnet/tn3270
525 @section rlogin, telnet and tn3270
526 @cindex rlogin
527 @cindex telnet
528 @cindex tn3270
529 @cindex terminal emulation
530 @findex terminal-emulator
531
532 These URL schemes from RFC 1738 for logon via a terminal emulator have
533 the form
534 @example
535 telnet://@var{user}:@var{password}@@@var{host}:@var{port}
536 @end example
537 but the @code{:@var{password}} component is ignored.
538
539 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
540 @code{telnet} or @code{tn3270} (the program names and arguments are
541 hardcoded) session is run in a @code{terminal-emulator} buffer.
542 Well-known ports are used if the URL does not specify a port.
543
544 @node irc
545 @section irc
546 @cindex IRC
547 @cindex Internet Relay Chat
548 @cindex ZEN IRC
549 @cindex ERC
550 @cindex rcirc
551 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
552 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
553 session to a function named in @code{url-irc-function}.
554
555 @defopt url-irc-function
556 A function to actually open an IRC connection.
557 This function
558 must take five arguments, @var{host}, @var{port}, @var{channel},
559 @var{user} and @var{password}. The @var{channel} argument specifies the
560 channel to join immediately; it can be @code{nil}. The default value of
561 this option is @code{url-irc-rcirc}.
562 @end defopt
563 @defun url-irc-rcirc host port channel user password
564 Processes the arguments and lets @code{rcirc} handle the session.
565 @end defun
566 @defun url-irc-erc host port channel user password
567 Processes the arguments and lets @code{ERC} handle the session.
568 @end defun
569 @defun url-irc-zenirc host port channel user password
570 Processes the arguments and lets @code{zenirc} handle the session.
571 @end defun
572
573 @node data
574 @section data
575 @cindex data URLs
576
577 @example
578 data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
579 @end example
580
581 Data URLs contain MIME data in the URL itself. They are defined in
582 RFC 2397.
583
584 @var{media-type} is a MIME @samp{Content-Type} string, possibly
585 including parameters. It defaults to
586 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
587 omitted but the charset parameter supplied. If @samp{;base64} is
588 present, the @var{data} are base64-encoded.
589
590 @node nfs
591 @section nfs
592 @cindex NFS
593 @cindex Network File System
594 @cindex automounter
595
596 @example
597 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
598 @end example
599
600 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
601 @samp{ftp:} except that it points to a file on a remote host that is
602 handled by the automounter on the local host.
603
604 @defvar url-nfs-automounter-directory-spec
605 A string saying how to invoke the NFS automounter. Certain @samp{%}
606 sequences are recognized:
607
608 @table @samp
609 @item %h
610 The hostname of the NFS server;
611 @item %n
612 The port number of the NFS server;
613 @item %u
614 The username to use to authenticate;
615 @item %p
616 The password to use to authenticate;
617 @item %f
618 The filename on the remote server;
619 @item %%
620 A literal @samp{%}.
621 @end table
622
623 Each can be used any number of times.
624 @end defvar
625
626 @node cid
627 @section cid
628 @cindex Content-ID
629
630 RFC 2111
631
632 @node about
633 @section about
634
635 @node ldap
636 @section ldap
637 @cindex LDAP
638 @cindex Lightweight Directory Access Protocol
639
640 The LDAP scheme is defined in RFC 2255.
641
642 @node imap
643 @section imap
644 @cindex IMAP
645
646 RFC 2192
647
648 @node man
649 @section man
650 @cindex @command{man}
651 @cindex Unix man pages
652 @findex man
653
654 @example
655 @samp{man:@var{page-spec}}
656 @end example
657
658 This is a non-standard scheme. @var{page-spec} is passed directly to
659 the Lisp @code{man} function.
660
661 @node Defining New URLs
662 @chapter Defining New URLs
663
664 @menu
665 * Naming conventions::
666 * Required functions::
667 * Optional functions::
668 * Asynchronous fetching::
669 * Supporting file-name-handlers::
670 @end menu
671
672 @node Naming conventions
673 @section Naming conventions
674
675 @node Required functions
676 @section Required functions
677
678 @node Optional functions
679 @section Optional functions
680
681 @node Asynchronous fetching
682 @section Asynchronous fetching
683
684 @node Supporting file-name-handlers
685 @section Supporting file-name-handlers
686
687 @node General Facilities
688 @chapter General Facilities
689
690 @menu
691 * Disk Caching::
692 * Proxies::
693 * Gateways in general::
694 * History::
695 @end menu
696
697 @node Disk Caching
698 @section Disk Caching
699 @cindex Caching
700 @cindex Persistent Cache
701 @cindex Disk Cache
702
703 The disk cache stores retrieved documents locally, whence they can be
704 retrieved more quickly. When requesting a URL that is in the cache,
705 the library checks to see if the page has changed since it was last
706 retrieved from the remote machine. If not, the local copy is used,
707 saving the transmission over the network.
708 @cindex Cleaning the cache
709 @cindex Clearing the cache
710 @cindex Cache cleaning
711 Currently the cache isn't cleared automatically.
712 @c Running the @code{clean-cache} shell script
713 @c fist is recommended, to allow for future cleaning of the cache. This
714 @c shell script will remove all files that have not been accessed since it
715 @c was last run. To keep the cache pared down, it is recommended that this
716 @c script be run from @i{at} or @i{cron} (see the manual pages for
717 @c crontab(5) or at(1) for more information)
718
719 @defopt url-automatic-caching
720 Setting this variable non-@code{nil} causes documents to be cached
721 automatically.
722 @end defopt
723
724 @defopt url-cache-directory
725 This variable specifies the
726 directory in which to store the cache files. It defaults to the subdirectory
727 @file{cache} of @code{url-configuration-directory}.
728 @end defopt
729
730 @c Fixme: function v. option, but neither used.
731 @c @findex url-cache-expired
732 @c @defopt url-cache-expired
733 @c This is a function to decide whether or not a cache entry has expired.
734 @c It takes two times as it parameters and returns non-@code{nil} if the
735 @c second time is ``too old'' when compared with the first time.
736 @c @end defopt
737
738 @defopt url-cache-creation-function
739 The cache relies on a scheme for mapping URLs to files in the cache.
740 This variable names a function which sets the type of cache to use.
741 It takes a URL as argument and returns the absolute file name of the
742 corresponding cache file. The two supplied possibilities are
743 @code{url-cache-create-filename-using-md5} and
744 @code{url-cache-create-filename-human-readable}.
745 @end defopt
746
747 @defun url-cache-create-filename-using-md5 url
748 Creates a cache file name from @var{url} using MD5 hashing.
749 This creates entries with very few cache collisions and is fast.
750 @cindex MD5
751 @smallexample
752 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
753 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
754 @end smallexample
755 @end defun
756
757 @defun url-cache-create-filename-human-readable url
758 Creates a cache file name from @var{url} more obviously connected to
759 @var{url} than for @code{url-cache-create-filename-using-md5}, but
760 more likely to conflict with other files.
761 @smallexample
762 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
763 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
764 @end smallexample
765 @end defun
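
Putting the caching options together, a program that wants documents
cached automatically under readable file names might use a setup like
the following. This is a sketch of one possible configuration, not a
recommended default.

@example
(require 'url-cache)

(setq url-automatic-caching t
      ;; Trade some risk of file name conflicts for readable names.
      url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example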
766
767 @c Fixme: never actually used currently?
768 @c @defopt url-standalone-mode
769 @c @cindex Relying on cache
770 @c @cindex Cache only mode
771 @c @cindex Standalone mode
772 @c If this variable is non-@code{nil}, the library relies solely on the
773 @c cache for fetching documents and avoids checking if they have changed
774 @c on remote servers.
775 @c @end defopt
776
777 @c With a large cache of documents on the local disk, it can be very handy
778 @c when traveling, or any other time the network connection is not active
779 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
780 @c solely on its cache, and avoid checking to see if the page has changed
781 @c on the remote server. In the case of a dial-on-demand PPP connection,
782 @c this will keep the phone line free as long as possible, only bringing up
783 @c the PPP connection when asking for a page that is not located in the
784 @c cache. This is very useful for demonstrations as well.
785
786 @node Proxies
787 @section Proxies and Gatewaying
788
789 @c fixme: check/document url-ns stuff
790 @cindex proxy servers
791 @cindex proxies
792 @cindex environment variables
793 @vindex HTTP_PROXY
794 Proxy servers are commonly used to provide gateways through firewalls
795 or as caches serving some more-or-less local network. Each protocol
796 (HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
797 conventionally configured, in a way shared by different programs, through
798 environment variables of the form @code{@var{protocol}_proxy}, where
799 @var{protocol} is one of the supported network protocols (@code{http},
800 @code{ftp} etc.). The library recognizes such variables in either
801 upper or lower case. Their values are of one of the forms:
802 @itemize @bullet
803 @item @code{@var{host}:@var{port}}
804 @item A full URL;
805 @item Simply a host name.
806 @end itemize
807
808 @vindex NO_PROXY
809 The @code{NO_PROXY} environment variable specifies URLs that should be
810 excluded from proxying (on servers that should be contacted directly).
811 This should be a comma-separated list of hostnames, domain names, or a
812 mixture of both. Asterisks can be used as wildcards, but other
813 clients may not support that. Domain names may be indicated by a
814 leading dot. For example:
815 @example
816 NO_PROXY="*.aventail.com,home.com,.seanet.com"
817 @end example
818 @noindent says to contact all machines in the @samp{aventail.com} and
819 @samp{seanet.com} domains directly, as well as the machine named
820 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
821 and @code{no_proxy} are also tried, in that order.
822
823 Proxies may also be specified directly in Lisp.
824
825 @defopt url-proxy-services
826 This variable is an alist of URL schemes and proxy servers that
827 gateway them. The items are of the form @w{@code{(@var{scheme}
828 . @var{host}:@var{portnumber})}}, meaning that the URL @var{scheme} is
829 gatewayed through @var{portnumber} on the specified @var{host}. An
830 exception is the pseudo scheme @code{"no_proxy"}, which is paired with
831 a regexp matching host names not to be proxied. This variable is
832 initialized from the environment as above.
833
834 @example
835 (setq url-proxy-services
836 '(("http" . "proxy.aventail.com:80")
837 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
838 @end example
839 @end defopt
840
841 @node Gateways in general
842 @section Gateways in General
843 @cindex gateways
844 @cindex firewalls
845
846 The library provides a general gateway layer through which all
847 networking passes. It can both control access to the network and
848 provide access through gateways in firewalls. This may make direct
849 connections in some cases and pass through some sort of gateway in
850 others.@footnote{Proxies (which only operate over HTTP) are
851 implemented using this.} The library's basic function responsible for
852 making connections is @code{url-open-stream}.
853
854 @defun url-open-stream name buffer host service
855 @cindex opening a stream
856 @cindex stream, opening
857 Open a stream to @var{host}, possibly via a gateway. The other
858 arguments are as for @code{open-network-stream}. This will not make a
859 connection if @code{url-gateway-unplugged} is non-@code{nil}.
860 @end defun
861
862 @defvar url-gateway-local-host-regexp
863 This is a regular expression that matches local hosts that do not
864 require the use of a gateway. If @code{nil}, all connections are made
865 through the gateway.
866 @end defvar
867
868 @defvar url-gateway-method
869 This variable controls which gateway method is used. It may be useful
870 to bind it temporarily in some applications. It has values taken from
871 a list of symbols. Possible values are:
872
873 @table @code
874 @item telnet
875 @cindex @command{telnet}
876 Use this method if you must first telnet and log into a gateway host,
877 and then run telnet from that host to connect to outside machines.
878
879 @item rlogin
880 @cindex @command{rlogin}
881 This method is identical to @code{telnet}, but uses @command{rlogin}
882 to log into the remote machine without having to send the username and
883 password over the wire every time.
884
885 @item socks
886 @cindex @sc{socks}
887 Use if the firewall has a @sc{socks} gateway running on it. The
888 @sc{socks} v5 protocol is defined in RFC 1928.
889
890 @c @item ssl
891 @c This probably shouldn't be documented
892 @c Fixme: why not? -- fx
893
894 @item native
895 This method uses Emacs's built-in networking directly. This is the
896 default. It can be used only if there is no firewall blocking access.
897 @end table
898 @end defvar
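
Binding the variable temporarily might look like the following
sketch, which requests the native method for a single call. The
function name @code{my-url-connect-directly} is hypothetical.

@example
(require 'url-gw)

;; Hypothetical helper: connect to HOST on PORT with the native
;; method, whatever the global value of `url-gateway-method' is.
(defun my-url-connect-directly (host port)
  (let ((url-gateway-method 'native))
    (url-open-stream "my-direct" nil host port)))
@end example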
899
900 The following variables control the gateway methods.
901
902 @defopt url-gateway-telnet-host
903 The gateway host to telnet to. Once logged in there, you then telnet
904 out to the hosts you want to connect to.
905 @end defopt
906 @defopt url-gateway-telnet-parameters
907 This should be a list of parameters to pass to the @command{telnet} program.
908 @end defopt
909 @defopt url-gateway-telnet-password-prompt
910 This is a regular expression that matches the password prompt when
911 logging in.
912 @end defopt
913 @defopt url-gateway-telnet-login-prompt
914 This is a regular expression that matches the username prompt when
915 logging in.
916 @end defopt
917 @defopt url-gateway-telnet-user-name
918 The username to log in with.
919 @end defopt
920 @defopt url-gateway-telnet-password
921 The password to send when logging in.
922 @end defopt
923 @defopt url-gateway-prompt-pattern
924 This is a regular expression that matches the shell prompt.
925 @end defopt
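
A telnet gateway configuration could be sketched as follows. The
host name, user name, and prompt regexps are placeholders chosen for
illustration; they must be adapted to the actual gateway.

@example
(require 'url-gw)

(setq url-gateway-method 'telnet
      ;; Placeholder gateway host and account.
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-user-name "jdoe"
      ;; Regexps matching the prompts the gateway prints (assumed).
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      ;; Regexp matching the shell prompt after logging in (assumed).
      url-gateway-prompt-pattern "^[^#$%>;]*[#$%>;] *")
@end example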
926
927 @defopt url-gateway-rlogin-host
928 Host to @samp{rlogin} to before telnetting out.
929 @end defopt
930 @defopt url-gateway-rlogin-parameters
931 Parameters to pass to @samp{rsh}.
932 @end defopt
933 @defopt url-gateway-rlogin-user-name
934 User name to use when logging in to the gateway.
935 @end defopt
939
940 @defopt socks-server
941 This specifies the default server; it takes the form
942 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
943 where @var{version} can be either 4 or 5.
944 @end defopt
945 @defvar socks-password
946 If this is @code{nil} then you will be asked for the password,
947 otherwise it will be used as the password for authenticating you to
948 the @sc{socks} server.
949 @end defvar
950 @defvar socks-username
951 This is the username to use when authenticating yourself to the
952 @sc{socks} server. By default this is your login name.
953 @end defvar
954 @defvar socks-timeout
955 This controls how long, in seconds, to wait for responses from the
956 @sc{socks} server; it is 5 by default.
957 @end defvar
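
Put together, a @sc{socks} setup might look like the sketch below.
The server name, port, and user name are placeholders.

@example
(require 'url-gw)
(require 'socks)

(setq url-gateway-method 'socks
      ;; Placeholder server speaking SOCKS version 5 on port 1080.
      socks-server '("Default server" "socks.example.com" 1080 5)
      socks-username "jdoe"
      ;; Wait up to 10 seconds for the server to respond.
      socks-timeout 10)
@end example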
958 @c fixme: these have been effectively commented-out in the code
959 @c @defopt socks-server-aliases
960 @c This is a list of server aliases. It is a list of aliases of the form
961 @c @var{(alias hostname port version)}.
962 @c @end defopt
963 @c @defopt socks-network-aliases
964 @c This is a list of network aliases. Each entry in the list takes the form
965 @c @var{(alias (network))} where @var{alias} is a string that names the
966 @c @var{network}. The networks can contain a pair (not a dotted pair) of
967 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
968 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
969 @c address.
970 @c @end defopt
971 @c @defopt socks-redirection-rules
972 @c This is a list of redirection rules. Each rule takes the form
973 @c @var{(Destination network Connection type)} where @var{Destination
974 @c network} is a network alias from @code{socks-network-aliases} and
975 @c @var{Connection type} can be @code{nil} in which case a direct
976 @c connection is used, or it can be an alias from
977 @c @code{socks-server-aliases} in which case that server is used as a
978 @c proxy.
979 @c @end defopt
980 @defopt socks-nslookup-program
981 @cindex @command{nslookup}
982 This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
983 @end defopt
984
985 @menu
986 * Suppressing network connections::
987 @end menu
988 @c * Broken hostname resolution::
989
990 @node Suppressing network connections
991 @subsection Suppressing Network Connections
992
993 @cindex network connections, suppressing
994 @cindex suppressing network connections
995 @cindex bugs, HTML
996 @cindex HTML `bugs'
997 In some circumstances it is desirable to suppress making network
998 connections. A typical case is when rendering HTML in a mail user
999 agent, when external URLs should not be activated, particularly to
1000 avoid `bugs' which `call home' by fetching single-pixel images and the
1001 like. To arrange this, bind the following variable for the duration
1002 of such processing.
1003
1004 @defvar url-gateway-unplugged
1005 If this variable is non-@code{nil}, new network connections are never
1006 opened by the URL library.
1007 @end defvar
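
A minimal sketch of such a binding follows. The helper name
@code{my-call-without-network} is hypothetical; the function passed
to it stands for whatever rendering code the application uses.

@example
(require 'url-gw)

;; Hypothetical helper: call RENDER-FUNCTION with no arguments while
;; the URL library is prevented from opening network connections.
(defun my-call-without-network (render-function)
  (let ((url-gateway-unplugged t))
    (funcall render-function)))
@end example

Because the variable is bound with @code{let}, its previous value is
restored when the helper returns, even if the rendering code signals
an error.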
1008
1009 @c @node Broken hostname resolution
1010 @c @subsection Broken Hostname Resolution
1011
1012 @c @cindex hostname resolver
1013 @c @cindex resolver, hostname
1014 @c Some C libraries do not include the hostname resolver routines in
1015 @c their static libraries. If Emacs was linked statically, and was not
1016 @c linked with the resolver libraries, it will not be able to get to any
1017 @c machines off the local network. This is characterized by being able
1018 @c to reach someplace with a raw ip number, but not its hostname
1019 @c (@url{http://129.79.254.191/} works, but
1020 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1021 @c SunOS4 and Ultrix, but is probably now rare. If Emacs can't be
1022 @c rebuilt linked against the resolver library, it can use the external
1023 @c @command{nslookup} program instead.
1024
1025 @c @defopt url-gateway-broken-resolution
1026 @c @cindex @code{nslookup} program
1027 @c @cindex program, @code{nslookup}
1028 @c If non-@code{nil}, this variable says to use the program specified by
1029 @c @code{url-gateway-nslookup-program} to do hostname resolution.
1030 @c @end defopt
1031
1032 @c @defopt url-gateway-nslookup-program
1033 @c The name of the program to do hostname lookup if Emacs can't do it
1034 @c directly. This program should expect a single argument on the command
1035 @c line---the hostname to resolve---and should produce output similar to
1036 @c the standard Unix @command{nslookup} program:
1037 @c @example
1038 @c Name: www.cs.indiana.edu
1039 @c Address: 129.79.254.191
1040 @c @end example
1041 @c @end defopt
1042
1043 @node History
1044 @section History
1045
1046 @findex url-do-setup
1047 The library can maintain a global history list tracking URLs accessed.
1048 URL completion can be done from it. The history mechanism is set up
1049 automatically via @code{url-do-setup} when it is configured to be on.
1050 Note that the size of the history list is currently not limited.
1051
1052 @vindex url-history-hash-table
1053 The history `list' is actually a hash table,
1054 @code{url-history-hash-table}. It contains access times keyed by URL
1055 strings. The times are in the format returned by @code{current-time}.
1056
1057 @defun url-history-update-url url time
1058 This function updates the history table with an entry for @var{url}
1059 accessed at the given @var{time}.
1060 @end defun
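
The following sketch records an access and reads it back from the
hash table. It binds @code{url-history-hash-table} to a scratch table
purely for illustration, so that no entry is added to the real
history.

@example
(require 'url-history)

;; Use a scratch table so that no entry is added to the real history.
(let ((url-history-hash-table (make-hash-table :test 'equal)))
  ;; Record an access at the current time.
  (url-history-update-url "http://www.gnu.org/" (current-time))
  ;; Look the URL up again and format the stored access time.
  (format-time-string
   "%Y-%m-%d"
   (gethash "http://www.gnu.org/" url-history-hash-table)))
@end example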
1061
1062 @defopt url-history-track
1063 If non-@code{nil}, the library will keep track of all the URLs
1064 accessed. If it is @code{t}, the list is saved to disk at the end of
1065 each Emacs session. The default is @code{nil}.
1066 @end defopt
1067
1068 @defopt url-history-file
1069 The file storing the history list between sessions. It defaults to
1070 @file{history} in @code{url-configuration-directory}.
1071 @end defopt
1072
1073 @defopt url-history-save-interval
1074 @findex url-history-setup-save-timer
1075 The number of seconds between automatic saves of the history list.
1076 Default is one hour. Note that if you change this variable directly,
1077 rather than using Custom, after @code{url-do-setup} has been run, you
1078 need to run the function @code{url-history-setup-save-timer}.
1079 @end defopt
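
For instance, an init file that sets these options with @code{setq}
rather than Custom might contain something like the following; the
interval of 1800 seconds is an arbitrary example value.

@example
(require 'url-history)

;; Track history and save it to disk at the end of the session.
(setq url-history-track t)
;; Save every 30 minutes instead of the default hour.
(setq url-history-save-interval 1800)
;; Restart the save timer so that the new interval takes effect.
(url-history-setup-save-timer)
@end example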
1080
1081 @defun url-history-parse-history &optional fname
1082 Parses the history file @var{fname} (default @code{url-history-file})
1083 and sets up the history list.
1084 @end defun
1085
1086 @defun url-history-save-history &optional fname
1087 Saves the current history to file @var{fname} (default
1088 @code{url-history-file}).
1089 @end defun
1090
1091 @defun url-completion-function string predicate function
1092 You can use this function to do completion of URLs from the history.
1093 @end defun
1094
1095 @node Customization
1096 @chapter Customization
1097
1098 @section Environment Variables
1099
1100 @cindex environment variables
1101 The following environment variables affect the library's operation at
1102 startup.
1103
1104 @table @code
1105 @item TMPDIR
1106 @vindex TMPDIR
1107 @vindex url-temporary-directory
1108 If this is defined, @code{url-temporary-directory} is initialized from
1109 it.
1110 @end table
1111
1112 @section General User Options
1113
1114 The following user options, settable with Customize, affect the
1115 general operation of the package.
1116
1117 @defopt url-debug
1118 @cindex debugging
1119 Specifies the types of debug messages which are logged to
1120 the @code{*URL-DEBUG*} buffer.
1121 @code{t} means log all messages.
1122 A number means log all messages and show them with @code{message}.
1123 It may also be a list of the types of messages to be logged.
1124 @end defopt
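
As a sketch of the list form, the example below logs only messages
tagged @code{http}. It calls the library's logging function
@code{url-debug} from @file{url-util.el}, which is not described in
this section; the tags and message texts are made up for
illustration.

@example
(require 'url-util)

;; Log only messages tagged `http'; bind rather than set the option
;; so that the change is temporary.
(let ((url-debug '(http)))
  ;; Logged, because the tag is in the list.
  (url-debug 'http "Fetching %s" "http://www.gnu.org/")
  ;; Not logged, because `ftp' is not in the list.
  (url-debug 'ftp "Fetching %s" "ftp://ftp.gnu.org/"))

;; Inspect the log.
(with-current-buffer "*URL-DEBUG*"
  (buffer-string))
@end example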
1125 @defopt url-personal-mail-address
1126 @end defopt
1127 @defopt url-privacy-level
1128 @end defopt
1129 @defopt url-uncompressor-alist
1130 @end defopt
1131 @defopt url-passwd-entry-func
1132 @end defopt
1133 @defopt url-standalone-mode
1134 @end defopt
1135 @defopt url-bad-port-list
1136 @end defopt
1137 @defopt url-max-password-attempts
1138 @end defopt
1139 @defopt url-temporary-directory
1140 @end defopt
1141 @defopt url-show-status
1142 @end defopt
1143 @defopt url-confirmation-func
1144 The function to use for asking yes or no questions. This is normally
1145 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1146 function taking a single argument (the prompt) and returning @code{t}
1147 only if an affirmative answer is given.
1148 @end defopt
1149 @defopt url-gateway-method
1150 @c fixme: describe gatewaying
1151 A symbol specifying the type of gateway support to use for connections
1152 from the local machine. The supported methods are:
1153
1154 @table @code
1155 @item telnet
1156 Run telnet in a subprocess to connect;
1157 @item rlogin
1158 Rlogin to another machine to connect;
1159 @item socks
1160 Connect through a socks server;
1161 @item ssl
1162 Connect with SSL;
1163 @item native
1164 Connect directly.
1165 @end table
1166 @end defopt
1167
1168 @node GNU Free Documentation License
1169 @appendix GNU Free Documentation License
1170 @include doclicense.texi
1171
1172 @node Function Index
1173 @unnumbered Command and Function Index
1174 @printindex fn
1175
1176 @node Variable Index
1177 @unnumbered Variable Index
1178 @printindex vr
1179
1180 @node Concept Index
1181 @unnumbered Concept Index
1182 @printindex cp
1183
1184 @setchapternewpage odd
1185 @contents
1186 @bye
1187
1188 @ignore
1189 arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
1190 @end ignore