1 \input texinfo
2 @setfilename ../info/url
3 @settitle URL Programmer's Manual
4
5 @iftex
6 @c @finalout
7 @end iftex
8 @c @setchapternewpage odd
9 @c @smallbook
10
11 @tex
12 \overfullrule=0pt
13 %\global\baselineskip 30pt % for printing in double space
14 @end tex
15 @dircategory World Wide Web
16 @dircategory GNU Emacs Lisp
17 @direntry
18 * URL: (url). URL loading package.
19 @end direntry
20
21 @ifnottex
22 This file documents the URL loading package.
23
24 Copyright @copyright{} 1996, 1997, 1998, 1999, 2002, 2004,
25 2005, 2006 Free Software Foundation, Inc.@*
26 Copyright @copyright{} 1993, 1994, 1995, 1996 William M. Perry
27
28 Permission is granted to copy, distribute and/or modify this document
29 under the terms of the GNU Free Documentation License, Version 1.2 or
30 any later version published by the Free Software Foundation; with the
31 Invariant Sections being
32 ``GNU GENERAL PUBLIC LICENSE''. A copy of the
33 license is included in the section entitled ``GNU Free Documentation
34 License.''
35 @end ifnottex
36
37 @c
38 @titlepage
39 @sp 6
40 @center @titlefont{URL}
41 @center @titlefont{Programmer's Manual}
42 @sp 4
43 @center First Edition, URL Version 2.0
44 @sp 1
45 @c @center December 1999
46 @sp 5
47 @center William M. Perry
48 @center @email{wmperry@@gnu.org}
49 @center David Love
50 @center @email{fx@@gnu.org}
51 @page
52 @vskip 0pt plus 1filll
53 Copyright @copyright{} 1993, 1994, 1995, 1996 William M. Perry@*
54 Copyright @copyright{} 1996, 1997, 1998, 1999, 2002, 2003, 2004,
55 2005, 2006 Free Software Foundation, Inc.
56
57 Permission is granted to copy, distribute and/or modify this document
58 under the terms of the GNU Free Documentation License, Version 1.2 or
59 any later version published by the Free Software Foundation; with the
60 Invariant Sections being
61 ``GNU GENERAL PUBLIC LICENSE''. A copy of the
62 license is included in the section entitled ``GNU Free Documentation
63 License.''
64 @end titlepage
65 @page
66 @node Top
67 @top URL
68
69
70
71 @menu
72 * Getting Started:: Preparing your program to use URLs.
73 * Retrieving URLs:: How to use this package to retrieve a URL.
74 * Supported URL Types:: Descriptions of URL types currently supported.
75 * Defining New URLs:: How to define a URL loader for a new protocol.
76 * General Facilities:: URLs can be cached, accessed via a gateway
77 and tracked in a history list.
78 * Customization:: Variables you can alter.
79 * Function Index::
80 * Variable Index::
81 * Concept Index::
82 @end menu
83
84 @node Getting Started
85 @chapter Getting Started
86 @cindex URLs, definition
87 @cindex URIs
88
89 @dfn{Uniform Resource Locators} (URLs) are a specific form of
90 @dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
91 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
92 agents.
93
94 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
95 @var{scheme}s supported by this library are described below.
96 @xref{Supported URL Types}.
97
98 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
99 IRC and gopher URLs all have the form
100
101 @example
102 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
103 @end example
104 @noindent
105 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
106 @var{userinfo} sometimes takes the form @var{username}:@var{password}
107 but you should beware of the security risks of sending cleartext
108 passwords. @var{hostname} may be a domain name or a dotted decimal
109 address. If the @samp{:@var{port}} is omitted then the library will
110 use the `well known' port for that service when accessing URLs. With
111 the possible exception of @code{telnet}, it is rare for ports to be
112 specified, and using a non-standard port may have
113 undesired consequences if a different service is listening on that
114 port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
115 sent). @c , but @xref{Other Variables, url-bad-port-list}.
116 The meaning of the @var{path} component depends on the service.
117
118 @menu
119 * Configuration::
120 * Parsed URLs:: URLs are parsed into vector structures.
121 @end menu
122
123 @node Configuration
124 @section Configuration
125
126 @defvar url-configuration-directory
127 @cindex @file{~/.url}
128 @cindex configuration files
129 The directory in which URL configuration files, the cache etc.,
130 reside. The default is @file{~/.url}.
131 @end defvar
132
133 @node Parsed URLs
134 @section Parsed URLs
135 @cindex parsed URLs
136 The library functions typically operate on @dfn{parsed} versions of
137 URLs. These are actually vectors of the form:
138
139 @example
140 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
141 @end example
142
143 @noindent where
144 @table @var
145 @item type
146 is the type of the URL scheme, e.g., @code{http};
147 @item user
148 is the username associated with it, or @code{nil};
149 @item password
150 is the user password associated with it, or @code{nil};
151 @item host
152 is the host name associated with it, or @code{nil};
153 @item port
154 is the port number associated with it, or @code{nil};
155 @item file
156 is the `file' part of it, or @code{nil}. This doesn't necessarily
157 actually refer to a file;
158 @item target
159 is the target part, or @code{nil};
160 @item attributes
161 is the attributes associated with it, or @code{nil};
162 @item full
163 is @code{t} for a fully-specified URL, with a host part indicated by
164 @samp{//} after the scheme part.
165 @end table
166
167 @findex url-type
168 @findex url-user
169 @findex url-password
170 @findex url-host
171 @findex url-port
172 @findex url-file
173 @findex url-target
174 @findex url-attributes
175 @findex url-full
176 @findex url-set-type
177 @findex url-set-user
178 @findex url-set-password
179 @findex url-set-host
180 @findex url-set-port
181 @findex url-set-file
182 @findex url-set-target
183 @findex url-set-attributes
184 @findex url-set-full
185 These attributes have accessors named @code{url-@var{part}}, where
186 @var{part} is the name of one of the elements above, e.g.,
187 @code{url-host}. Similarly, there are setters of the form
188 @code{url-set-@var{part}}.
189
190 There are functions for parsing and unparsing between the string and
191 vector forms.
192
193 @defun url-generic-parse-url url
194 Return a parsed version of the string @var{url}.
195 @end defun
196
197 @defun url-recreate-url url
198 @cindex unparsing URLs
199 Recreates a URL string from the parsed @var{url}.
200 @end defun
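
As a minimal sketch of the functions above, the following parses a URL
string, reads two of its parts through the accessors, and converts the
parsed form back into a string. The address @samp{www.example.com} is
only a placeholder chosen for illustration.

@example
(require 'url-parse)

(let ((parsed (url-generic-parse-url "http://www.example.com/foo/bar")))
  (list (url-type parsed)            ; the scheme, "http"
        (url-host parsed)            ; the host, "www.example.com"
        (url-recreate-url parsed)))  ; the parsed URL as a string again
@end example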
201
202 @node Retrieving URLs
203 @chapter Retrieving URLs
204
205 @defun url-retrieve-synchronously url
206 Retrieve @var{url} synchronously and return a buffer containing the
207 data. @var{url} is either a string or a parsed URL structure. Return
208 @code{nil} if there are no data associated with it (the case for dired,
209 info, or mailto URLs that need no further processing).
210 @end defun
211
212 @defun url-retrieve url callback &optional cbargs
213 Retrieve @var{url} asynchronously and call @var{callback} with args
214 @var{cbargs} when finished. The callback is called when the object
215 has been completely retrieved, with the current buffer containing the
216 object and any MIME headers associated with it. @var{url} is either a
217 string or a parsed URL structure. Returns the buffer @var{url} will
218 load into, or @code{nil} if the process has already completed.
219 @end defun
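
The two retrieval functions might be used as sketched below. The
address is a placeholder, and the callback is written to accept any
arguments because what it receives depends on @var{cbargs}; treat this
as an illustration under those assumptions rather than a definitive
recipe.

@example
(require 'url)

;; Synchronous: the result is a buffer, or nil if there are no data.
(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  (when buffer
    (with-current-buffer buffer
      (message "Retrieved %d characters" (buffer-size)))
    (kill-buffer buffer)))

;; Asynchronous: the callback runs with the retrieved object and its
;; headers in the current buffer.
(url-retrieve "http://www.example.com/"
              (lambda (&rest args)
                (message "Retrieved %d characters" (buffer-size))))
@end example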
220
221 @node Supported URL Types
222 @chapter Supported URL Types
223
224 @menu
225 * http/https:: Hypertext Transfer Protocol.
226 * file/ftp:: Local files and FTP archives.
227 * info:: Emacs `Info' pages.
228 * mailto:: Sending email.
229 * news/nntp/snews:: Usenet news.
230 * rlogin/telnet/tn3270:: Remote host connectivity.
231 * irc:: Internet Relay Chat.
232 * data:: Embedded data URLs.
233 * nfs:: Networked File System
234 @c * finger::
235 @c * gopher::
236 @c * netrek::
237 @c * prospero::
238 * cid:: Content-ID.
239 * about::
240 * ldap:: Lightweight Directory Access Protocol
241 * imap:: IMAP mailboxes.
242 * man:: Unix man pages.
243 @end menu
244
245 @node http/https
246 @section @code{http} and @code{https}
247
248 The scheme @code{http} is Hypertext Transfer Protocol. The library
249 supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
250 defined in RFC 1945.) HTTP URLs have the following form, where most of
251 the parts are optional:
252 @example
253 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
254 @end example
255 @c The @code{:@var{port}} part is optional, and @var{port} defaults to
256 @c 80. The @code{/@var{path}} part, if present, is a slash-separated
257 @c series elements. The @code{?@var{searchpart}}, if present, is the
258 @c query for a search or the content of a form submission. The
259 @c @code{#fragment} part, if present, is a location in the document.
260
261 The scheme @code{https} is a secure version of @code{http}, with
262 transmission via SSL. It is defined in RFC 2818. Its default port is
263 443. This scheme depends on SSL support in Emacs via the
264 @file{ssl.el} library and is actually implemented by forcing the
265 @code{ssl} gateway method to be used. @xref{Gateways in general}.
266
267 @defopt url-honor-refresh-requests
268 This controls honouring of HTTP @samp{Refresh} headers by which
269 servers can direct clients to reload documents from the same URL or a
270 different one. @code{nil} means they will not be honoured,
271 @code{t} (the default) means they will always be honoured, and
272 otherwise the user will be asked on each request.
273 @end defopt
274
275
276 @menu
277 * Cookies::
278 * HTTP language/coding::
279 * HTTP URL Options::
280 * Dealing with HTTP documents::
281 @end menu
282
283 @node Cookies
284 @subsection Cookies
285
286 @defopt url-cookie-file
287 The file in which cookies are stored, defaulting to @file{cookies} in
288 the directory specified by @code{url-configuration-directory}.
289 @end defopt
290
291 @defopt url-cookie-confirmation
292 Specifies whether confirmation is required to accept cookies.
293 @end defopt
294
295 @defopt url-cookie-multiple-line
296 Specifies whether to put all cookies for the server on one line in the
297 HTTP request to satisfy broken servers like
298 @url{http://www.hotmail.com}.
299 @end defopt
300
301 @defopt url-cookie-trusted-urls
302 A list of regular expressions matching URLs from which to accept
303 cookies always.
304 @end defopt
305
306 @defopt url-cookie-untrusted-urls
307 A list of regular expressions matching URLs from which to reject
308 cookies always.
309 @end defopt
310
311 @defopt url-cookie-save-interval
312 The number of seconds between automatic saves of cookies to disk.
313 Default is one hour.
314 @end defopt
315
316
317 @node HTTP language/coding
318 @subsection Language and Encoding Preferences
319
320 HTTP allows clients to express preferences for the language and
321 encoding of documents which servers may honour. For each of these
322 variables, the value is a string; it can specify a single choice, or
323 it can be a comma-separated list.
324
325 Normally this list is ordered by descending preference. However, each
326 element can be followed by @samp{;q=@var{priority}} to specify its
327 preference level, a decimal number from 0 to 1; e.g., for
328 @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
329 en;q=0.7"}}. An element that has no @samp{;q} specification has
330 preference level 1.
331
332 @defopt url-mime-charset-string
333 @cindex character sets
334 @cindex coding systems
335 This variable specifies a preference for character sets when documents
336 can be served in more than one encoding.
337
338 HTTP allows specifying a series of MIME charsets which indicate your
339 preferred character set encodings, e.g., Latin-9 or Big5, and these
340 can be weighted. The default series is generated automatically from
341 the associated MIME types of all defined coding systems, sorted by the
342 coding system priority specified in Emacs. @xref{Recognize Coding, ,
343 Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
344 @end defopt
345
346 @defopt url-mime-language-string
347 @cindex language preferences
348 A string specifying the preferred language when servers can serve
349 files in several languages. Use RFC 1766 abbreviations, e.g.,
350 @samp{en} for English, @samp{de} for German.
351
352 The string can be @code{"*"} to get the first available language (as
353 opposed to the default).
354 @end defopt
355
356 @node HTTP URL Options
357 @subsection HTTP URL Options
358
359 HTTP supports an @samp{OPTIONS} method describing things supported by
360 the URL@.
361
362 @defun url-http-options url
363 Returns a property list describing options available for @var{url}. The
364 property list members are:
365
366 @table @code
367 @item methods
368 A list of symbols specifying what HTTP methods the resource
369 supports.
370
371 @item dav
372 @cindex DAV
373 A list of numbers specifying what DAV protocol/schema versions are
374 supported.
375
376 @item dasl
377 @cindex DASL
378 A list of supported DASL search types (string form).
379
380 @item ranges
381 A list of the units available for use in partial document fetches.
382
383 @item p3p
384 @cindex P3P
385 The @dfn{Platform for Privacy Preferences} description for the resource.
386 Currently this is just the raw header contents.
387 @end table
388
389 @end defun
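
A short sketch of how the returned property list might be inspected
follows. It assumes that the property names are the plain symbols
listed in the table above and that @samp{www.example.com} stands in
for a real server.

@example
(require 'url-http)

(let ((options (url-http-options "http://www.example.com/")))
  (list (plist-get options 'methods)   ; supported HTTP methods
        (plist-get options 'ranges)))  ; units for partial fetches
@end example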
390
391 @node Dealing with HTTP documents
392 @subsection Dealing with HTTP documents
393
394 HTTP URLs are retrieved into a buffer containing the HTTP headers
395 followed by the body. Since the headers are quasi-MIME, they may be
396 processed using the MIME library. @xref{Top,, Emacs MIME,
397 emacs-mime, The Emacs MIME Manual}. The URL package provides a
398 function to do this in general:
399
400 @defun url-decode-text-part handle &optional coding
401 This function decodes charset-encoded text in the current buffer. In
402 Emacs, the buffer is expected to be unibyte initially and is set to
403 multibyte after decoding.
404 @var{handle} is the MIME handle of the original part. @var{coding} is an
405 explicit coding to use, overriding what the MIME headers specify.
406 The coding system used for the decoding is returned.
407
408 Note that this function doesn't deal with @samp{http-equiv} charset
409 specifications in HTML @samp{<meta>} elements.
410 @end defun
411
412 @node file/ftp
413 @section file and ftp
414 @cindex files
415 @cindex FTP
416 @cindex File Transfer Protocol
417 @cindex compressed files
418 @findex dired
419
420 @example
421 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
422 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
423 @end example
424
425 These schemes are defined in RFC 1738.
426 @samp{ftp:} and @samp{file:} are synonymous in this library. They
427 allow reading arbitrary files from hosts. Either @samp{ange-ftp}
428 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
429 hosts. Local files are accessed directly.
430
431 Compressed files are handled, but support is hard-coded so that
432 @code{jka-compr-compression-info-list} and so on have no effect.
433 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
434 @samp{.bz2}.
435
436 @defopt url-directory-index-file
437 The filename to look for when indexing a directory, default
438 @samp{"index.html"}. If this file exists, and is readable, then it
439 will be viewed instead of using @code{dired} to view the directory.
440 @end defopt
441
442 @node info
443 @section info
444 @cindex Info
445 @cindex Texinfo
446 @findex Info-goto-node
447
448 @example
449 info:@var{file}#@var{node}
450 @end example
451
452 Info URLs are not officially defined. They invoke
453 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
454 @samp{#@var{node}} is optional, defaulting to @samp{Top}.
455
456 @node mailto
457 @section mailto
458
459 @cindex mailto
460 @cindex email
461 A mailto URL will send an email message to the address in the
462 URL, for example @samp{mailto:foo@@bar.com} would compose a
463 message to @samp{foo@@bar.com}.
464
465 @defopt url-mail-command
466 @vindex mail-user-agent
467 The function called whenever url needs to send mail. This should
468 normally be left to default from @code{mail-user-agent}. @xref{Mail
469 Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
470 @end defopt
471
472 An @samp{X-Url-From} header field containing the URL of the document
473 that contained the mailto URL is added if that URL is known.
474
475 RFC 2368 extends the definition of mailto URLs in RFC 1738.
476 The form of a mailto URL is
477 @example
478 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
479 @end example
480 @noindent where an arbitrary number of @var{header}s can be added. If the
481 @var{header} is @samp{body}, then @var{contents} is put in the body;
482 otherwise a @var{header} header field is created with @var{contents}
483 as its contents. Note that the URL library does not consider any
484 headers `dangerous' so you should check them before sending the
485 message.
486
487 @c Fixme: update
488 Email messages are defined in RFC 822.
489
490 @node news/nntp/snews
491 @section @code{news}, @code{nntp} and @code{snews}
492 @cindex news
493 @cindex network news
494 @cindex usenet
495 @cindex NNTP
496 @cindex snews
497
498 @c draft-gilman-news-url-01
499 The network news URL schemes take the following forms, following RFC
500 1738, except that for compatibility with other clients, host and port
501 fields may be included in news URLs though they are properly only
502 allowed for nntp and snews.
503
504 @table @samp
505 @item news:@var{newsgroup}
506 Retrieves a list of messages in @var{newsgroup};
507 @item news:@var{message-id}
508 Retrieves the message with the given @var{message-id};
509 @item news:*
510 Retrieves a list of all available newsgroups;
511 @item nntp://@var{host}:@var{port}/@var{newsgroup}
512 @itemx nntp://@var{host}:@var{port}/@var{message-id}
513 @itemx nntp://@var{host}:@var{port}/*
514 Similar to the @samp{news} versions.
515 @end table
516
517 @samp{:@var{port}} is optional and defaults to :119.
518
519 @samp{snews} is the same as @samp{nntp} except that the default port
520 is :563.
521 @cindex SSL
522 (It is tunneled through SSL.)
523
524 An @samp{nntp} URL is the same as a news URL, except that the URL may
525 specify an article by its number.
526
527 @defopt url-news-server
528 This variable can be used to override the default news server.
529 Usually this will be set by the Gnus package, which is used to fetch
530 news.
531 @cindex environment variable
532 @vindex NNTPSERVER
533 It may be set from the conventional environment variable
534 @code{NNTPSERVER}.
535 @end defopt
536
537 @node rlogin/telnet/tn3270
538 @section rlogin, telnet and tn3270
539 @cindex rlogin
540 @cindex telnet
541 @cindex tn3270
542 @cindex terminal emulation
543 @findex terminal-emulator
544
545 These URL schemes from RFC 1738 for logon via a terminal emulator have
546 the form
547 @example
548 telnet://@var{user}:@var{password}@@@var{host}:@var{port}
549 @end example
550 but the @code{:@var{password}} component is ignored.
551
552 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
553 @code{telnet} or @code{tn3270} (the program names and arguments are
554 hardcoded) session is run in a @code{terminal-emulator} buffer.
555 Well-known ports are used if the URL does not specify a port.
556
557 @node irc
558 @section irc
559 @cindex IRC
560 @cindex Internet Relay Chat
561 @cindex ZEN IRC
562 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
563 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
564 session to a function named in @code{url-irc-function}.
565
566 @defopt url-irc-function
567 A function to actually open an IRC connection.
568 This function
569 must take five arguments, @var{host}, @var{port}, @var{channel},
570 @var{user} and @var{password}. The @var{channel} argument specifies the
571 channel to join immediately; it can be @code{nil}. The default value is
572 @code{url-irc-zenirc}.
573 @end defopt
574 @defun url-irc-zenirc host port channel user password
575 Processes the arguments and lets @code{zenirc} handle the session.
576 @end defun
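
A replacement handler only needs to accept the five arguments described
above. The sketch below uses a hypothetical function name,
@code{my-url-irc-handler}, which merely reports the request instead of
opening a connection; a real handler would start an IRC client here.

@example
(defun my-url-irc-handler (host port channel user password)
  "Hypothetical handler that only reports the IRC request."
  (message "IRC request: host %s, port %s, channel %s"
           host port (or channel "none")))

(setq url-irc-function 'my-url-irc-handler)
@end example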
577
578 @node data
579 @section data
580 @cindex data URLs
581
582 @example
583 data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
584 @end example
585
586 Data URLs contain MIME data in the URL itself. They are defined in
587 RFC 2397.
588
589 @var{media-type} is a MIME @samp{Content-Type} string, possibly
590 including parameters. It defaults to
591 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
592 omitted but the charset parameter supplied. If @samp{;base64} is
593 present, the @var{data} are base64-encoded.
594
595 @node nfs
596 @section nfs
597 @cindex NFS
598 @cindex Network File System
599 @cindex automounter
600
601 @example
602 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
603 @end example
604
605 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
606 @samp{ftp:} except that it points to a file on a remote host that is
607 handled by the automounter on the local host.
608
609 @defvar url-nfs-automounter-directory-spec
610 A string saying how to invoke the NFS automounter. Certain @samp{%}
611 sequences are recognized:
612
613 @table @samp
614 @item %h
615 The hostname of the NFS server;
616 @item %n
617 The port number of the NFS server;
618 @item %u
619 The username to use to authenticate;
620 @item %p
621 The password to use to authenticate;
622 @item %f
623 The filename on the remote server;
624 @item %%
625 A literal @samp{%}.
626 @end table
627
628 Each can be used any number of times.
629 @end defvar
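
As an illustration, a site whose automounter makes remote hosts
available under @file{/net} might use a value like the one below. The
@file{/net} layout is an assumption about the local system, not
something this library requires.

@example
;; %h is replaced by the server's hostname, %f by the remote filename.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example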
630
631 @node cid
632 @section cid
633 @cindex Content-ID
634
635 Content-ID URLs are defined in RFC 2111.
636
637 @node about
638 @section about
639
640 @node ldap
641 @section ldap
642 @cindex LDAP
643 @cindex Lightweight Directory Access Protocol
644
645 The LDAP scheme is defined in RFC 2255.
646
647 @node imap
648 @section imap
649 @cindex IMAP
650
651 IMAP URLs are defined in RFC 2192.
652
653 @node man
654 @section man
655 @cindex @command{man}
656 @cindex Unix man pages
657 @findex man
658
659 @example
660 @samp{man:@var{page-spec}}
661 @end example
662
663 This is a non-standard scheme. @var{page-spec} is passed directly to
664 the Lisp @code{man} function.
665
666 @node Defining New URLs
667 @chapter Defining New URLs
668
669 @menu
670 * Naming conventions::
671 * Required functions::
672 * Optional functions::
673 * Asynchronous fetching::
674 * Supporting file-name-handlers::
675 @end menu
676
677 @node Naming conventions
678 @section Naming conventions
679
680 @node Required functions
681 @section Required functions
682
683 @node Optional functions
684 @section Optional functions
685
686 @node Asynchronous fetching
687 @section Asynchronous fetching
688
689 @node Supporting file-name-handlers
690 @section Supporting file-name-handlers
691
692 @node General Facilities
693 @chapter General Facilities
694
695 @menu
696 * Disk Caching::
697 * Proxies::
698 * Gateways in general::
699 * History::
700 @end menu
701
702 @node Disk Caching
703 @section Disk Caching
704 @cindex Caching
705 @cindex Persistent Cache
706 @cindex Disk Cache
707
708 The disk cache stores retrieved documents locally, whence they can be
709 retrieved more quickly. When requesting a URL that is in the cache,
710 the library checks to see if the page has changed since it was last
711 retrieved from the remote machine. If not, the local copy is used,
712 saving the transmission over the network.
713 @cindex Cleaning the cache
714 @cindex Clearing the cache
715 @cindex Cache cleaning
716 Currently the cache isn't cleared automatically.
717 @c Running the @code{clean-cache} shell script
718 @c first is recommended, to allow for future cleaning of the cache. This
719 @c shell script will remove all files that have not been accessed since it
720 @c was last run. To keep the cache pared down, it is recommended that this
721 @c script be run from @i{at} or @i{cron} (see the manual pages for
722 @c crontab(5) or at(1) for more information)
723
724 @defopt url-automatic-caching
725 Setting this variable non-@code{nil} causes documents to be cached
726 automatically.
727 @end defopt
728
729 @defopt url-cache-directory
730 This variable specifies the
731 directory in which to store the cache files. It defaults to the subdirectory
732 @file{cache} of @code{url-configuration-directory}.
733 @end defopt
734
735 @c Fixme: function v. option, but neither used.
736 @c @findex url-cache-expired
737 @c @defopt url-cache-expired
738 @c This is a function to decide whether or not a cache entry has expired.
739 @c It takes two times as its parameters and returns non-@code{nil} if the
740 @c second time is ``too old'' when compared with the first time.
741 @c @end defopt
742
743 @defopt url-cache-creation-function
744 The cache relies on a scheme for mapping URLs to files in the cache.
745 This variable names a function which sets the type of cache to use.
746 It takes a URL as argument and returns the absolute file name of the
747 corresponding cache file. The two supplied possibilities are
748 @code{url-cache-create-filename-using-md5} and
749 @code{url-cache-create-filename-human-readable}.
750 @end defopt
751
752 @defun url-cache-create-filename-using-md5 url
753 Creates a cache file name from @var{url} using MD5 hashing.
754 @findex md5
755 This creates entries with very few cache collisions and is fast if
756 you have the @code{md5} function as a primitive (Emacs 21 and XEmacs).
757 @smallexample
758 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
759 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
760 @end smallexample
761 @end defun
762
763 @defun url-cache-create-filename-human-readable url
764 Creates a cache file name from @var{url} more obviously connected to
765 @var{url} than for @code{url-cache-create-filename-using-md5}, but
766 more likely to conflict with other files.
767 @smallexample
768 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
769 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
770 @end smallexample
771 @end defun
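
Putting the caching options together, a configuration along the
following lines would turn on automatic caching and select the
human-readable naming scheme. It is a sketch of one possible setup;
the choice between the two supplied functions is a matter of
preference.

@example
(setq url-automatic-caching t
      url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example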
772
773 @c Fixme: never actually used currently?
774 @c @defopt url-standalone-mode
775 @c @cindex Relying on cache
776 @c @cindex Cache only mode
777 @c @cindex Standalone mode
778 @c If this variable is non-@code{nil}, the library relies solely on the
779 @c cache for fetching documents and avoids checking if they have changed
780 @c on remote servers.
781 @c @end defopt
782
783 @c With a large cache of documents on the local disk, it can be very handy
784 @c when traveling, or any other time the network connection is not active
785 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
786 @c solely on its cache, and avoid checking to see if the page has changed
787 @c on the remote server. In the case of a dial-on-demand PPP connection,
788 @c this will keep the phone line free as long as possible, only bringing up
789 @c the PPP connection when asking for a page that is not located in the
790 @c cache. This is very useful for demonstrations as well.
791
792 @node Proxies
793 @section Proxies and Gatewaying
794
795 @c fixme: check/document url-ns stuff
796 @cindex proxy servers
797 @cindex proxies
798 @cindex environment variables
799 @vindex HTTP_PROXY
800 Proxy servers are commonly used to provide gateways through firewalls
801 or as caches serving some more-or-less local network. Each protocol
802 (HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
803 conventionally configured, in the same way for many programs, through
804 environment variables of the form @code{@var{protocol}_proxy}, where
805 @var{protocol} is one of the supported network protocols (@code{http},
806 @code{ftp} etc.). The library recognizes such variables in either
807 upper or lower case. Their values are of one of the forms:
808 @itemize @bullet
809 @item @code{@var{host}:@var{port}}
810 @item A full URL;
811 @item Simply a host name.
812 @end itemize
813
814 @vindex NO_PROXY
815 The @code{NO_PROXY} environment variable specifies URLs that should be
816 excluded from proxying (on servers that should be contacted directly).
817 This should be a comma-separated list of hostnames, domain names, or a
818 mixture of both. Asterisks can be used as wildcards, but other
819 clients may not support that. Domain names may be indicated by a
820 leading dot. For example:
821 @example
822 NO_PROXY="*.aventail.com,home.com,.seanet.com"
823 @end example
824 @noindent says to contact all machines in the @samp{aventail.com} and
825 @samp{seanet.com} domains directly, as well as the machine named
826 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
827 and @code{no_proxy} are also tried, in that order.
828
829 Proxies may also be specified directly in Lisp.
830
831 @defopt url-proxy-services
832 This variable is an alist of URL schemes and proxy servers that
833 gateway them. The items are of the form @w{@code{(@var{scheme}
834 . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
835 gatewayed through @var{portnumber} on the specified @var{host}. An
836 exception is the pseudo scheme @code{"no_proxy"}, which is paired with
837 a regexp matching host names not to be proxied. This variable is
838 initialized from the environment as above.
839
840 @example
841 (setq url-proxy-services
842 '(("http" . "proxy.aventail.com:80")
843 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
844 @end example
845 @end defopt
846
847 @node Gateways in general
848 @section Gateways in General
849 @cindex gateways
850 @cindex firewalls
851
852 The library provides a general gateway layer through which all
853 networking passes. It can both control access to the network and
854 provide access through gateways in firewalls. This may make direct
855 connections in some cases and pass through some sort of gateway in
856 others.@footnote{Proxies (which only operate over HTTP) are
857 implemented using this.} The library's basic function responsible for
858 making connections is @code{url-open-stream}.
859
860 @defun url-open-stream name buffer host service
861 @cindex opening a stream
862 @cindex stream, opening
863 Open a stream to @var{host}, possibly via a gateway. The other
864 arguments are as for @code{open-network-stream}. This will not make a
865 connection if @code{url-gateway-unplugged} is non-@code{nil}.
866 @end defun
867
868 @defvar url-gateway-local-host-regexp
869 This is a regular expression that matches local hosts that do not
870 require the use of a gateway. If @code{nil}, all connections are made
871 through the gateway.
872 @end defvar
873
874 @defvar url-gateway-method
875 This variable controls which gateway method is used. It may be useful
876 to bind it temporarily in some applications. It has values taken from
877 a list of symbols. Possible values are:
878
879 @table @code
880 @item telnet
881 @cindex @command{telnet}
882 Use this method if you must first telnet and log into a gateway host,
883 and then run telnet from that host to connect to outside machines.
884
885 @item rlogin
886 @cindex @command{rlogin}
887 This method is identical to @code{telnet}, but uses @command{rlogin}
888 to log into the remote machine without having to send the username and
889 password over the wire every time.
890
891 @item socks
892 @cindex @sc{socks}
893 Use if the firewall has a @sc{socks} gateway running on it. The
894 @sc{socks} v5 protocol is defined in RFC 1928.
895
896 @c @item ssl
897 @c This probably shouldn't be documented
898 @c Fixme: why not? -- fx
899
900 @item native
901 This method uses Emacs's built-in networking directly. This is the
902 default. It can be used only if there is no firewall blocking access.
903 @end table
904 @end defvar
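
Since the variable can usefully be bound temporarily, a sketch of
doing so around a single retrieval might look like this. The URL is
only an example, and the sketch assumes that a @sc{socks} server has
already been configured.

@example
(require 'url)

;; Use the SOCKS gateway for this one request only; the global
;; value of `url-gateway-method' is left untouched.
(let ((url-gateway-method 'socks))
  (url-retrieve-synchronously "http://www.gnu.org/"))
@end example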
905
906 The following variables control the gateway methods.
907
908 @defopt url-gateway-telnet-host
909 The gateway host to telnet to. Once logged in there, you then telnet
910 out to the hosts you want to connect to.
911 @end defopt
912 @defopt url-gateway-telnet-parameters
913 This should be a list of parameters to pass to the @command{telnet} program.
914 @end defopt
915 @defopt url-gateway-telnet-password-prompt
916 This is a regular expression that matches the password prompt when
917 logging in.
918 @end defopt
919 @defopt url-gateway-telnet-login-prompt
920 This is a regular expression that matches the username prompt when
921 logging in.
922 @end defopt
923 @defopt url-gateway-telnet-user-name
924 The username to log in with.
925 @end defopt
926 @defopt url-gateway-telnet-password
927 The password to send when logging in.
928 @end defopt
929 @defopt url-gateway-prompt-pattern
930 This is a regular expression that matches the shell prompt.
931 @end defopt
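
A configuration for the @code{telnet} method might be sketched as
follows. The host name, user name and prompt regular expressions are
assumptions made for illustration; they must be adapted to the actual
gateway host.

@example
(setq url-gateway-method 'telnet
      ;; Placeholder gateway host and account.
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-user-name "jdoe"
      ;; Assumed prompts shown by the gateway when logging in.
      url-gateway-telnet-login-prompt "^[Ll]ogin: *"
      url-gateway-telnet-password-prompt "^[Pp]assword: *"
      ;; Assumed shell prompt: a line ending in "$ " or "% ".
      url-gateway-prompt-pattern "^.*[$%] *")
@end example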
932
933 @defopt url-gateway-rlogin-host
934 Host to @samp{rlogin} to before telnetting out.
935 @end defopt
936 @defopt url-gateway-rlogin-parameters
937 Parameters to pass to @samp{rsh}.
938 @end defopt
939 @defopt url-gateway-rlogin-user-name
940 User name to use when logging in to the gateway.
941 @end defopt
945
946 @defopt socks-server
947 This specifies the default server. It takes the form
948 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
949 where @var{version} can be either 4 or 5.
950 @end defopt
951 @defvar socks-password
952 If this is @code{nil} then you will be asked for the password,
953 otherwise it will be used as the password for authenticating you to
954 the @sc{socks} server.
955 @end defvar
956 @defvar socks-username
957 This is the username to use when authenticating yourself to the
958 @sc{socks} server. By default this is your login name.
959 @end defvar
960 @defvar socks-timeout
961 This controls how long, in seconds, to wait for responses from the
962 @sc{socks} server; it is 5 by default.
963 @end defvar
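
Taken together, a @sc{socks} setup could be sketched like this. The
server name and user name are placeholders; port 1080 is the
conventional @sc{socks} port.

@example
(setq url-gateway-method 'socks
      ;; ("Default server" SERVER PORT VERSION); version 5 here.
      socks-server '("Default server" "socks.example.com" 1080 5)
      ;; Placeholder user name; by default the login name is used.
      socks-username "jdoe"
      ;; nil means the password is asked for when needed.
      socks-password nil
      ;; Wait up to 10 seconds for the server instead of 5.
      socks-timeout 10)
@end example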
964 @c fixme: these have been effectively commented-out in the code
965 @c @defopt socks-server-aliases
966 @c This a list of server aliases. It is a list of aliases of the form
967 @c @var{(alias hostname port version)}.
968 @c @end defopt
969 @c @defopt socks-network-aliases
970 @c This a list of network aliases. Each entry in the list takes the form
971 @c @var{(alias (network))} where @var{alias} is a string that names the
972 @c @var{network}. The networks can contain a pair (not a dotted pair) of
973 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
974 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
975 @c address.
976 @c @end defopt
977 @c @defopt socks-redirection-rules
978 @c This a list of redirection rules. Each rule take the form
979 @c @var{(Destination network Connection type)} where @var{Destination
980 @c network} is a network alias from @code{socks-network-aliases} and
981 @c @var{Connection type} can be @code{nil} in which case a direct
982 @c connection is used, or it can be an alias from
983 @c @code{socks-server-aliases} in which case that server is used as a
984 @c proxy.
985 @c @end defopt
986 @defopt socks-nslookup-program
987 @cindex @command{nslookup}
988 This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
989 @end defopt
990
991 @menu
992 * Suppressing network connections::
993 @end menu
994 @c * Broken hostname resolution::
995
996 @node Suppressing network connections
997 @subsection Suppressing Network Connections
998
999 @cindex network connections, suppressing
1000 @cindex suppressing network connections
1001 @cindex bugs, HTML
1002 @cindex HTML `bugs'
1003 In some circumstances it is desirable to suppress making network
1004 connections. A typical case is when rendering HTML in a mail user
1005 agent, when external URLs should not be activated, particularly to
1006 avoid `bugs' which `call home' by fetching single-pixel images and the
1007 like. To arrange this, bind the following variable for the duration
1008 of such processing.
1009
1010 @defvar url-gateway-unplugged
1011 If this variable is non-@code{nil}, new network connections are never
1012 opened by the URL library.
1013 @end defvar
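
Binding the variable for the duration of some processing can be
sketched with a small wrapper. The name @code{my-call-unplugged} is
hypothetical and is not part of the library.

@example
(require 'url-gw)

(defun my-call-unplugged (function)
  "Call FUNCTION with the URL library kept from opening connections."
  ;; The binding lasts only while FUNCTION runs.
  (let ((url-gateway-unplugged t))
    (funcall function)))
@end example

A mail user agent could then pass its HTML rendering code to this
wrapper as a function of no arguments.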
1014
1015 @c @node Broken hostname resolution
1016 @c @subsection Broken Hostname Resolution
1017
1018 @c @cindex hostname resolver
1019 @c @cindex resolver, hostname
1020 @c Some C libraries do not include the hostname resolver routines in
1021 @c their static libraries. If Emacs was linked statically, and was not
1022 @c linked with the resolver libraries, it will not be able to get to any
1023 @c machines off the local network. This is characterized by being able
1024 @c to reach someplace with a raw ip number, but not its hostname
1025 @c (@url{http://129.79.254.191/} works, but
1026 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1027 @c SunOS4 and Ultrix, but is now probably now rare. If Emacs can't be
1028 @c rebuilt linked against the resolver library, it can use the external
1029 @c @command{nslookup} program instead.
1030
1031 @c @defopt url-gateway-broken-resolution
1032 @c @cindex @code{nslookup} program
1033 @c @cindex program, @code{nslookup}
1034 @c If non-@code{nil}, this variable says to use the program specified by
1035 @c @code{url-gateway-nslookup-program} program to do hostname resolution.
1036 @c @end defopt
1037
1038 @c @defopt url-gateway-nslookup-program
1039 @c The name of the program to do hostname lookup if Emacs can't do it
1040 @c directly. This program should expect a single argument on the command
1041 @c line---the hostname to resolve---and should produce output similar to
1042 @c the standard Unix @command{nslookup} program:
1043 @c @example
1044 @c Name: www.cs.indiana.edu
1045 @c Address: 129.79.254.191
1046 @c @end example
1047 @c @end defopt
1048
1049 @node History
1050 @section History
1051
1052 The library can maintain a global history list tracking URLs accessed.
1053 URL completion can be done from it. The history mechanism is set up
1054 @findex url-do-setup
1055 automatically via @code{url-do-setup} when it is configured to be on.
1056 Note that the size of the history list is currently not limited.
1057
1058 @vindex url-history-hash-table
1059 The history `list' is actually a hash table,
1060 @code{url-history-hash-table}. It contains access times keyed by URL
1061 strings. The times are in the format returned by @code{current-time}.
1062
1063 @defun url-history-update-url url time
1064 This function updates the history table with an entry for @var{url}
1065 accessed at the given @var{time}.
1066 @end defun
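
As a sketch of how the table is used, the following records an access
and reads the time back. The URL is only an example, and the check
for a hash table is a precaution in case the history mechanism has
not yet been set up.

@example
(require 'url-history)

;; Only touch the table if the history mechanism has created it.
(when (hash-table-p url-history-hash-table)
  ;; Record an access to an example URL at the current time.
  (url-history-update-url "http://www.gnu.org/" (current-time))
  ;; Look the access time up again by URL string.
  (gethash "http://www.gnu.org/" url-history-hash-table))
@end example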
1067
1068 @defopt url-history-track
1069 If non-@code{nil}, the library will keep track of all the URLs
1070 accessed. If it is @code{t}, the list is saved to disk at the end of
1071 each Emacs session. The default is @code{nil}.
1072 @end defopt
1073
1074 @defopt url-history-file
1075 The file storing the history list between sessions. It defaults to
1076 @file{history} in @code{url-configuration-directory}.
1077 @end defopt
1078
1079 @defopt url-history-save-interval
1080 @findex url-history-setup-save-timer
1081 The number of seconds between automatic saves of the history list.
1082 Default is one hour. Note that if you change this variable directly,
1083 rather than using Custom, after @code{url-do-setup} has been run, you
1084 need to run the function @code{url-history-setup-save-timer}.
1085 @end defopt
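
For instance, turning on history tracking and changing the save
interval from Lisp, after @code{url-do-setup} has run, might be
sketched as follows. The interval of 30 minutes is an arbitrary
choice.

@example
(require 'url-history)

;; Track URLs and save the list at the end of the session.
(setq url-history-track t)
;; Save every 30 minutes instead of every hour.
(setq url-history-save-interval (* 30 60))
;; Needed because the interval was set directly, not through Custom.
(url-history-setup-save-timer)
@end example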
1086
1087 @defun url-history-parse-history &optional fname
1088 Parses the history file @var{fname} (default @code{url-history-file})
1089 and sets up the history list.
1090 @end defun
1091
1092 @defun url-history-save-history &optional fname
1093 Saves the current history to file @var{fname} (default
1094 @code{url-history-file}).
1095 @end defun
1096
1097 @defun url-completion-function string predicate function
1098 You can use this function to do completion of URLs from the history.
1099 @end defun
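
The argument list resembles that of a programmed completion function,
so a direct call can be sketched as below. This assumes that the
third argument is interpreted as in that convention (@code{nil} to act
like @code{try-completion}, @code{t} to act like
@code{all-completions}); the prefix string is arbitrary.

@example
(require 'url-history)

;; Ask for the URLs in the history that start with the prefix.
(url-completion-function "http://www." nil t)
@end example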
1100
1101 @node Customization
1102 @chapter Customization
1103
1104 @section Environment Variables
1105
1106 @cindex environment variables
1107 The following environment variables affect the library's operation at
1108 startup.
1109
1110 @table @code
1111 @item TMPDIR
1112 @vindex TMPDIR
1113 @vindex url-temporary-directory
1114 If this is defined, @code{url-temporary-directory} is initialized from
1115 it.
1116 @end table
1117
1118 @section General User Options
1119
1120 The following user options, settable with Customize, affect the
1121 general operation of the package.
1122
1123 @defopt url-debug
1124 @cindex debugging
1125 Specifies the types of debug messages which are logged to
1126 the @code{*URL-DEBUG*} buffer.
1127 @code{t} means log all messages.
1128 A number means log all messages and show them with @code{message}.
1129 It may also be a list of the types of messages to be logged.
1130 @end defopt
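
A short sketch of turning on full logging and then inspecting the
result:

@example
;; Log all debug messages.
(setq url-debug t)

;; Later, after some URLs have been retrieved, show the log
;; buffer if it has been created.
(when (get-buffer "*URL-DEBUG*")
  (display-buffer "*URL-DEBUG*"))
@end example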
1131 @defopt url-personal-mail-address
1132 @end defopt
1133 @defopt url-privacy-level
1134 @end defopt
1135 @defopt url-uncompressor-alist
1136 @end defopt
1137 @defopt url-passwd-entry-func
1138 @end defopt
1139 @defopt url-standalone-mode
1140 @end defopt
1141 @defopt url-bad-port-list
1142 @end defopt
1143 @defopt url-max-password-attempts
1144 @end defopt
1145 @defopt url-temporary-directory
1146 @end defopt
1147 @defopt url-show-status
1148 @end defopt
1149 @defopt url-confirmation-func
1150 The function to use for asking yes or no questions. This is normally
1151 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1152 function taking a single argument (the prompt) and returning @code{t}
1153 only if an affirmative answer is given.
1154 @end defopt
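
As an illustration, a replacement function might decline
automatically when Emacs runs in batch mode and otherwise fall back
to @code{y-or-n-p}. The name @code{my-url-confirm} is hypothetical.

@example
(defun my-url-confirm (prompt)
  "Ask PROMPT with `y-or-n-p', but answer no in batch mode."
  (if noninteractive
      nil
    (y-or-n-p prompt)))

(setq url-confirmation-func 'my-url-confirm)
@end example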
1155 @defopt url-gateway-method
1156 @c fixme: describe gatewaying
1157 A symbol specifying the type of gateway support to use for connections
1158 from the local machine. The supported methods are:
1159
1160 @table @code
1161 @item telnet
1162 Run telnet in a subprocess to connect;
1163 @item rlogin
1164 Rlogin to another machine to connect;
1165 @item socks
1166 Connect through a socks server;
1167 @item ssl
1168 Connect with SSL;
1169 @item native
1170 Connect directly.
1171 @end table
1172 @end defopt
1173
1174 @node Function Index
1175 @unnumbered Command and Function Index
1176 @printindex fn
1177
1178 @node Variable Index
1179 @unnumbered Variable Index
1180 @printindex vr
1181
1182 @node Concept Index
1183 @unnumbered Concept Index
1184 @printindex cp
1185
1186 @setchapternewpage odd
1187 @contents
1188 @bye
1189
1190 @ignore
1191 arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
1192 @end ignore