Template talk:Cite FTP

Archive URLs

Hi User:Trappist the monk,

Starting to convert ftp:// links. There are cases like 1928 Austin city plan with a dead FTP link:

  • Cite web: Koch & Fowler (January 14, 1928). "A City Plan for Austin, Texas" (PDF). City of Austin. Retrieved March 26, 2021.

A dead FTP link is expected. Unexpectedly, though, the Wayback Machine has an archive available: https://web.archive.org/web/20220216082907/ftp://ftp.austintexas.gov/GIS-Data/planning/compplan/1927_Plan.pdf

Similarly at Salem, Oregon there is a live FTP link and live Wayback link:

This signifies the Wayback Machine is actually capable of saving pure FTP links, at least sometimes (FTP has a lot of variations and complications).

I added a new paragraph to the instructions (Special:Diff/1291377442/1291528534) for cases with an available Wayback link: use {{cite web}} with |url-status=dead. This is fine, except that sometimes the FTP link is not dead (for FTP clients), so I expect some editors will complain. My thought is: who cares? The universe of FTP users vs. Web users is so lopsided that we are doing the right thing by foregrounding the available Web link. And I don't think we should modify CS1|2 to add a new url-status feature, with all the complications that creates in the code, documentation, tools and bots. -- GreenC 20:20, 21 May 2025 (UTC)

Ok, works for me.
Trappist the monk (talk) 22:59, 21 May 2025 (UTC)

Conversion process

Details of conversion program.


  • A) Maintain native CS1|2 templates (e.g. cite web, cite journal, etc.) when there is a working HTTPS link in |url= and/or |archive-url=
  • B) Prioritize access to HTTPS - FTP is not accessible in web browsers without special software, whereas HTTPS works for everyone. This means even if an FTP link works with FTP client software, if there is an HTTPS alternative, such as via a Wayback URL, the Wayback URL is prioritized through a CS1|2 template + |url-status=dead
  • C) If no HTTPS is available, convert CS1|2 templates to {{Cite FTP}}, which displays help information about accessing FTP links, and adds tracking categories specific to native FTP links.

Tools:

  • lftp - works with a variety of FTP/S servers and auto negotiates correctly
  • API:Parsing wikitext - verify newly created templates, checking for red errors

Steps:

  1. Check if the link works by converting to https:// - if so, change the link and exit
  2. Check if a Wayback Machine link exists. If so, keep the CS1|2 template and add |archive-url= and |url-status=dead, then exit
  3. Convert template to {{Cite FTP}}. If FTP link is not working add |url-status=dead, exit.
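
The three steps above can be sketched as a small decision function. This is only an illustration of the ordering, not the actual conversion program: the `LinkStatus` fields and the `convert_citation` name are hypothetical, and the link checks themselves (HTTPS probe, Wayback lookup, FTP test) are assumed to have been done elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkStatus:
    """Results of the link checks (hypothetical structure, for illustration)."""
    https_works: bool            # same address responds when rewritten to https://
    wayback_url: Optional[str]   # Wayback Machine snapshot URL, if one exists
    ftp_works: bool              # an FTP client can still retrieve the link


def convert_citation(template: str, status: LinkStatus) -> dict:
    """Decide what to do with a CS1|2 template that holds an ftp:// link."""
    # Step 1: a live https:// equivalent wins; keep the template as-is.
    if status.https_works:
        return {"template": template, "action": "switch url to https"}

    # Step 2: a Wayback snapshot exists; keep the template, add the archive.
    if status.wayback_url:
        return {
            "template": template,
            "action": "add archive",
            "archive-url": status.wayback_url,
            "url-status": "dead",
        }

    # Step 3: no HTTPS access at all; convert to {{Cite FTP}}.
    result = {"template": "Cite FTP", "action": "convert"}
    if not status.ftp_works:
        result["url-status"] = "dead"
    return result
```

For example, `convert_citation("cite web", LinkStatus(False, None, True))` returns a conversion to Cite FTP with no |url-status=, since the FTP link still works.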

Notes:

Note 1: For square links there is a similar process, except that it only changes the link to a live https:// or Wayback link, or adds a {{dead link}} template. It does not convert to {{Cite FTP}}. If the FTP link is working in an FTP client and there is also a working Wayback link, it favors the Wayback link because that is accessible without a dedicated client. If the FTP link is working and there is no Wayback, it retains the original square FTP link. If the FTP link is not working and there is no Wayback replacement, it adds a {{dead link}}.
Note 2: Due to thousands of {{Cite map}} templates containing FTP links, many with broken archive URLs and no Wayback replacement, the program will convert to {{Cite FTP}} where possible with |url-status=dead, and delete or replace any broken archive URLs.
Note 3: When converting other templates (news, journal, report, etc.) to {{Cite FTP}} per step 3 above, the program verifies via API:Parsing_wikitext that the result doesn't produce a red error caused by unsupported parameters.
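
The square-link rules in Note 1 can be sketched the same way. This is a hedged illustration only: the function name and return strings are hypothetical, and checking https:// before Wayback is an assumption carried over from the Steps list, since Note 1 does not state the order.

```python
from typing import Optional


def handle_square_link(https_works: bool,
                       wayback_url: Optional[str],
                       ftp_works: bool) -> str:
    """Pick an action for a square ftp:// link (never converts to Cite FTP)."""
    if https_works:
        # Assumed to take priority, mirroring step 1 of the Steps list.
        return "replace with https link"
    if wayback_url:
        # Preferred even when the FTP link still works in an FTP client.
        return "replace with Wayback link"
    if ftp_works:
        # Working FTP, no Wayback: leave the original link alone.
        return "keep original FTP link"
    # Dead FTP, no replacement available.
    return "add {{dead link}}"
```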

-- GreenC 19:23, 22 May 2025 (UTC)

Conversion result

Stats:

Convert to {{Cite FTP}}

  • ChangeCiteFTP FTP1.1 = 5,334
  • DeadCiteFTP FTP1.3 (informational) = 4,745

Modify square link

  • DeadSquareFTP FTP3.4 = 2,272
  • ChangeSquare2WaybackFTP FTP3.2 = 244
  • WorkingSquareFTP FTP3.5 = 76
  • ChangeSquareFTP FTP3.1 = 19

Add archive to {{Cite web/journal/etc}}

  • ChangeCiteWeb FTP2.1 = 1,219
  • SkipChangeCiteWeb FTP2.2 = 5

Misc

  • FoundSquareWaybackFTP FTP4.1 = 376
  • InvalidSourceFTP FTP5.1 = 299
  • NewCiteFTPError FTP7.1 = 46
  • NewCiteWebError FTP7.2 = 1

Key

  • ChangeCiteFTP FTP1.1 = Convert a CS1|2 template to {{Cite FTP}}
  • DeadCiteFTP FTP1.3 = Conversion has a dead ftp link
  • DeadSquareFTP FTP3.4 = Square FTP link that is dead ie. {{dead link}} tagged
  • ChangeSquare2WaybackFTP FTP3.2 = Square FTP link that converted to a square Wayback link
  • WorkingSquareFTP FTP3.5 = Preexisting square Wayback FTP link
  • ChangeSquareFTP FTP3.1 = Square FTP link that redirects to a new FTP link
  • ChangeCiteWeb FTP2.1 = CS1|2 template with a FTP link that had a Wayback link added
  • SkipChangeCiteWeb FTP2.2 = TBD
  • FoundSquareWaybackFTP FTP4.1 = TBD
  • InvalidSourceFTP FTP5.1 = CS1|2 template unsupported during conversion (mostly {{Cite book}})
  • NewCiteFTPError FTP7.1 = During the conversion it generated a red error and was skipped
  • NewCiteWebError FTP7.2 = During the conversion it generated a red error and was skipped

Analysis

  • Of the 5,334 (FTP1.1) conversions to {{Cite FTP}}, 4,745 required |url-status=dead, i.e. the FTP link is dead and no archive is available. The remaining 589 are working FTP links
  • Of the 2,611 square FTP links in total (FTP3.*), 2,272 are dead with no archive available.
  • There were 1,219 (FTP2.1) FTP links in a CS1|2 template that had a working archive URL
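
As a quick check of the arithmetic behind these figures, the counts from the Stats list can be combined as below. The variable names are illustrative only; the numbers are copied from the stats above.

```python
# Counts copied from the Stats list above.
cite_ftp_total = 5334   # ChangeCiteFTP FTP1.1
cite_ftp_dead = 4745    # DeadCiteFTP FTP1.3

square_counts = {
    "FTP3.4": 2272,  # DeadSquareFTP
    "FTP3.2": 244,   # ChangeSquare2WaybackFTP
    "FTP3.5": 76,    # WorkingSquareFTP
    "FTP3.1": 19,    # ChangeSquareFTP
}

# Working {{Cite FTP}} links are whatever is left after the dead ones.
cite_ftp_working = cite_ftp_total - cite_ftp_dead

# The FTP3.* total is the sum of the four square-link counters.
square_total = sum(square_counts.values())

# Share of links that are dead with no archive, as fractions of each group.
cite_ftp_dead_rate = cite_ftp_dead / cite_ftp_total
square_dead_rate = square_counts["FTP3.4"] / square_total

print(f"Working Cite FTP links: {cite_ftp_working}")
print(f"Square FTP links total: {square_total}")
print(f"Cite FTP dead rate: {cite_ftp_dead_rate:.1%}")
print(f"Square link dead rate: {square_dead_rate:.1%}")
```

Both dead rates come out a little under 90%, which is the basis for the conclusion that follows.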

Overall, the conclusion is that there is a very high rate of FTP link rot on Enwiki. -- GreenC 00:07, 28 May 2025 (UTC)

Just dropping by to say thanks for going through these. Most editors ignore FTP links altogether, so they wouldn't know that those links are dead. Rjjiii (talk) 00:14, 28 May 2025 (UTC)
Thanks! -- GreenC 02:13, 28 May 2025 (UTC)
Some of the dead FTP links are actually copied at live HTTP links. For the USGS/EPA ecoregion maps, I restored the original citations with the live HTTP links. This preserves the public domain attribution. -- hike395 (talk) 08:33, 28 May 2025 (UTC)
Thank you User:hike395. Looking at this diff... yeah, that's what you have to do. It takes manual investigation and work; no bot can do it alone. Hopefully, now that the dead FTP links are exposed, others will be encouraged to do the same. -- GreenC 16:01, 28 May 2025 (UTC)