
Module:Citation/CS1/sandbox

Module documentation

This module and its associated submodules support the Citation Style 1 and Citation Style 2 citation templates. The two styles are commonly abbreviated as CS1 and CS2.

cs1 • cs2 modules
Live | Sandbox | Description
Module:Citation/CS1 | Module:Citation/CS1/sandbox | Rendering and support functions
Module:Citation/CS1/Configuration | Module:Citation/CS1/Configuration/sandbox | Translation tables; error and identifier handlers
Module:Citation/CS1/Whitelist | Module:Citation/CS1/Whitelist/sandbox | List of active, deprecated, and obsolete cs1 • 2 parameters
Module:Citation/CS1/Date validation | Module:Citation/CS1/Date validation/sandbox | Date format validation functions
Module:Citation/CS1/Identifiers | Module:Citation/CS1/Identifiers/sandbox | Functions that support the named identifiers (isbn, doi, pmid, etc)
Module:Citation/CS1/Utilities | Module:Citation/CS1/Utilities/sandbox | Common functions and tables
Module:Citation/CS1/COinS | Module:Citation/CS1/COinS/sandbox | Functions that render a cs1 • 2 template's metadata
Module:Citation/CS1/Suggestions | Module:Citation/CS1/Suggestions/sandbox | List that maps common erroneous parameter names to valid parameter names
local z = {
	error_categories = {};	-- for categorizing citations that contain errors
	error_ids = {};
	message_tail = {};
	maintenance_cats = {};	-- for categorizing citations that aren't erroneous per se, but could use a little work
	properties_cats = {};	-- for categorizing citations based on certain properties, language of source for instance
}

--[[--------------------------< F O R W A R D   D E C L A R A T I O N S >--------------------------------------
]]

local dates, year_date_check	-- functions in Module:Citation/CS1/Date_validation
local cfg = {};	-- table of configuration tables that are defined in Module:Citation/CS1/Configuration
local whitelist = {};	-- table of tables listing valid template parameter names; defined in Module:Citation/CS1/Whitelist

--[[--------------------------< I S _ S E T >------------------------------------------------------------------

Returns true if argument is set; false otherwise. Argument is 'set' when it exists (not nil) and is not an empty string.
This function is global because it is called from both this module and from Date validation

]]

function is_set( var )
	return not (var == nil or var == '');
end

--[[--------------------------< F I R S T _ S E T >------------------------------------------------------------

Locates and returns the first set value in a table of values where the order established in the table,
left-to-right (or top-to-bottom), is the order in which the values are evaluated. Returns nil if none are set.

This version replaces the original 'for _, val in pairs do' and a similar version that used ipairs. With the pairs
version the order of evaluation could not be guaranteed. With the ipairs version, a nil value would terminate
the for-loop before it reached the actual end of the list.

]]

local function first_set (list, count)
	local i = 1;
	while i <= count do	-- loop through all items in list
		if is_set( list[i] ) then
			return list[i];	-- return the first set list member
		end
		i = i + 1;	-- point to next
	end
end

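The ipairs limitation described in the first_set() comment can be seen in a small stand-alone sketch. The is_set() and first_set() definitions below are local copies of the ones in the listing; first_set_ipairs is a hypothetical name for the replaced ipairs variant, reconstructed here for comparison only and not taken from the module's history.

```lua
-- Local stand-ins for the helpers in the listing, so the sketch runs on its own.
local function is_set (var)
	return not (var == nil or var == '');
end

local function first_set (list, count)
	local i = 1;
	while i <= count do
		if is_set (list[i]) then
			return list[i];
		end
		i = i + 1;
	end
end

-- Hypothetical reconstruction of the replaced ipairs variant (an assumption).
local function first_set_ipairs (list)
	for _, val in ipairs (list) do
		if is_set (val) then
			return val;
		end
	end
end

-- The first slot is nil, as happens when the first alias is not used in a template.
local candidates = {nil, '', 'Smith'};

local with_count = first_set (candidates, 3);		-- counted loop steps past the nil and the empty string
local with_ipairs = first_set_ipairs (candidates);	-- ipairs stops at the nil in slot 1
```

Because the caller passes the count explicitly, the loop does not depend on where Lua thinks the table ends, which is the reason the comment gives for the rewrite.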
--[[--------------------------< I N _ A R R A Y >--------------------------------------------------------------

Whether needle is in haystack

]]

local function in_array( needle, haystack )
	if needle == nil then
		return false;
	end
	for n,v in ipairs( haystack ) do
		if v == needle then
			return n;
		end
	end
	return false;
end

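One detail of in_array() that the one-line comment does not spell out: on success it returns the position of the match rather than true. A minimal sketch, using a local copy of the function and key names that appear elsewhere in the listing:

```lua
-- Local copy of in_array() from the listing so the sketch is self-contained.
local function in_array (needle, haystack)
	if needle == nil then
		return false;
	end
	for n, v in ipairs (haystack) do
		if v == needle then
			return n;
		end
	end
	return false;
end

local keys = {'italic-title', 'trans-italic-title'};

local hit = in_array ('trans-italic-title', keys);	-- index of the match (truthy)
local miss = in_array ('parameter', keys);		-- false
local none = in_array (nil, keys);			-- false; nil needle is rejected up front
```

Any number is truthy in Lua, so callers that only test the result in an if statement are unaffected by the index being returned.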
--[[--------------------------< S U B S T I T U T E >----------------------------------------------------------

Populates numbered arguments in a message string using an argument table.

]]

local function substitute( msg, args )
	return args and mw.message.newRawMessage( msg, args ):plain() or msg;
end

--[[--------------------------< E R R O R _ C O M M E N T >----------------------------------------------------

Wraps error messages with css markup according to the state of hidden.

]]

local function error_comment( content, hidden )
	return substitute( hidden and cfg.presentation['hidden-error'] or cfg.presentation['visible-error'], content );
end

--[[--------------------------< S E T _ E R R O R >--------------------------------------------------------------

Sets an error condition and returns the appropriate error message. The actual placement of the error message in the output is
the responsibility of the calling function.

]]

local function set_error( error_id, arguments, raw, prefix, suffix )
	local error_state = cfg.error_conditions[ error_id ];

	prefix = prefix or "";
	suffix = suffix or "";

	if error_state == nil then
		error( cfg.messages['undefined_error'] );
	elseif is_set( error_state.category ) then
		table.insert( z.error_categories, error_state.category );
	end

	local message = substitute( error_state.message, arguments );

	message = message .. " ([[" .. cfg.messages['help page link'] ..
		"#" .. error_state.anchor .. "|" ..
		cfg.messages['help page label'] .. "]])";

	z.error_ids[ error_id ] = true;
	if in_array( error_id, { 'bare_url_missing_title', 'trans_missing_title' } )
			and z.error_ids['citation_missing_title'] then
		return '', false;
	end

	message = table.concat({ prefix, message, suffix });

	if raw == true then
		return message, error_state.hidden;
	end

	return error_comment( message, error_state.hidden );
end

--[[--------------------------< A D D _ M A I N T _ C A T >------------------------------------------------------

Adds a category to z.maintenance_cats using names from the configuration file with additional text if any.
To prevent duplication, the added_maint_cats table lists the categories by key that have been added to z.maintenance_cats.

]]

local added_maint_cats = {}	-- list of maintenance categories that have been added to z.maintenance_cats
local function add_maint_cat (key, arguments)
	if not added_maint_cats [key] then
		added_maint_cats [key] = true;	-- note that we've added this category
		table.insert( z.maintenance_cats, substitute (cfg.maint_cats [key], arguments));	-- make name then add to table
	end
end

--[[--------------------------< A D D _ P R O P _ C A T >--------------------------------------------------------

Adds a category to z.properties_cats using names from the configuration file with additional text if any.

]]

local added_prop_cats = {}	-- list of property categories that have been added to z.properties_cats
local function add_prop_cat (key, arguments)
	if not added_prop_cats [key] then
		added_prop_cats [key] = true;	-- note that we've added this category
		table.insert( z.properties_cats, substitute (cfg.prop_cats [key], arguments));	-- make name then add to table
	end
end

--[[--------------------------< A D D _ V A N C _ E R R O R >----------------------------------------------------

Adds a single Vancouver system error message to the template's output regardless of how many errors actually exist.
To prevent duplication, added_vanc_errs is nil until an error message is emitted.

]]

local added_vanc_errs;	-- flag so we only emit one Vancouver error / category
local function add_vanc_error ()
	if not added_vanc_errs then
		added_vanc_errs = true;	-- note that we've added this category
		table.insert( z.message_tail, { set_error( 'vancouver', {}, true ) } );
	end
end

--[[--------------------------< I S _ S C H E M E >------------------------------------------------------------

Does this thing that purports to be a uri scheme seem to be a valid scheme? The scheme is checked to see if it
is in agreement with http://tools.ietf.org/html/std66#section-3.1 which says:
	Scheme names consist of a sequence of characters beginning with a
	letter and followed by any combination of letters, digits, plus
	("+"), period ("."), or hyphen ("-").

Returns a true value (the matched scheme text) if it does, else a false value (nil).

]]

local function is_scheme (scheme)
	return scheme and scheme:match ('^%a[%a%d%+%.%-]*:');	-- truthy if scheme is set and matches the pattern
end

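A short stand-alone sketch of how the is_scheme() pattern behaves on a few inputs. The function is a local copy of the one in the listing; the sample scheme strings are illustrative choices, not values taken from the module.

```lua
-- Local copy of is_scheme() from the listing.
local function is_scheme (scheme)
	return scheme and scheme:match ('^%a[%a%d%+%.%-]*:');
end

local ok = is_scheme ('https:');		-- matched text is returned, which is truthy
local plus = is_scheme ('svn+ssh:');		-- '+' is allowed after the first letter
local bad_start = is_scheme ('1http:');		-- nil: must begin with a letter
local no_colon = is_scheme ('https');		-- nil: the pattern requires the trailing colon
local unset = is_scheme (nil);			-- nil: short-circuits on the unset argument
```

The function returns the matched string or nil rather than strict booleans, so it is best used in boolean context, as is_url() does.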
--[=[-------------------------< I S _ D O M A I N _ N A M E >--------------------------------------------------

Does this thing that purports to be a domain name seem to be a valid domain name?

Syntax defined here: http://tools.ietf.org/html/rfc1034#section-3.5
BNF defined here: https://tools.ietf.org/html/rfc4234
Single character names are generally reserved; see https://tools.ietf.org/html/draft-ietf-dnsind-iana-dns-01#page-15;
	see also [[Single-letter second-level domain]]
list of tlds: https://www.iana.org/domains/root/db

rfc952 (modified by rfc 1123) requires the first and last character of a hostname to be a letter or a digit. Between
the first and last characters the name may use letters, digits, and the hyphen.

Also allowed are IPv4 addresses. IPv6 not supported

domain is expected to be stripped of any path so that the last character in the string is the last character of the tld. tld
is two or more alpha characters. Any preceding '//' (from splitting a url with a scheme) will be stripped
here. Perhaps not necessary but retained in case it is necessary for IPv4 dot decimal.

There are several tests:
	the first character of the whole domain name including subdomains must be a letter or a digit
	single-letter/digit second-level domains in the .org TLD
	q, x, and z SL domains in the .com TLD
	i and q SL domains in the .net TLD
	single-letter SL domains in the ccTLDs (where the ccTLD is two letters)
	two-character SL domains in gTLDs (where the gTLD is two or more letters)
	three-plus-character SL domains in gTLDs (where the gTLD is two or more letters)
	IPv4 dot-decimal address format; TLD not allowed

returns true if domain appears to be a proper name and tld or IPv4 address, else false

]=]

local function is_domain_name (domain)
	if not domain then
		return false;	-- if not set, abandon
	end

	domain = domain:gsub ('^//', '');	-- strip '//' from domain name if present; done here so we only have to do it once

	if not domain:match ('^[%a%d]') then	-- first character must be letter or digit
		return false;
	end

	if domain:match ('%f[%a%d][%a%d]%.org$') then	-- one character .org hostname
		return true;
	elseif domain:match ('%f[%a][qxz]%.com$') then	-- assigned one character .com hostname (x.com times out 2015-12-10)
		return true;
	elseif domain:match ('%f[%a][iq]%.net$') then	-- assigned one character .net hostname (q.net registered but not active 2015-12-10)
		return true;
	elseif domain:match ('%f[%a%d][%a%d][%a%d%-]+[%a%d]%.xn%-%-[%a%d]+$') then	-- internationalized domain name with ACE prefix
		return true;
	elseif domain:match ('%f[%a%d][%a%d]%.cash$') then	-- one character/digit .cash hostname
		return true;
	elseif domain:match ('%f[%a%d][%a%d]%.%a%a$') then	-- one character hostname and cctld (2 chars)
		return true;
	elseif domain:match ('%f[%a%d][%a%d][%a%d]%.%a%a+$') then	-- two character hostname and tld
		return true;
	elseif domain:match ('%f[%a%d][%a%d][%a%d%-]+[%a%d]%.%a%a+$') then	-- three or more character hostname.hostname or hostname.tld
		return true;
	elseif domain:match ('^%d%d?%d?%.%d%d?%d?%.%d%d?%d?%.%d%d?%d?') then	-- IPv4 address
		return true;
	else
		return false;
	end
end

--[[--------------------------< I S _ U R L >------------------------------------------------------------------

returns true if the scheme and domain parts of a url appear to be a valid url; else false.

This function is the last step in the validation process. This function is separate because there are cases that
are not covered by split_url(), for example is_parameter_ext_wikilink() which is looking for bracketed external
wikilinks.

]]

local function is_url (scheme, domain)
	if is_set (scheme) then	-- if scheme is set check it and domain
		return is_scheme (scheme) and is_domain_name (domain);
	else
		return is_domain_name (domain);	-- scheme not set when url is protocol relative
	end
end

--[[--------------------------< S P L I T _ U R L >------------------------------------------------------------

Split a url into a scheme, authority indicator, and domain.

For a protocol relative url, return nil scheme and the domain. If nothing that looks like a url is found, return nil
for both scheme and domain.

When not protocol relative, get scheme, authority indicator, and domain. If there is an authority indicator (one
or more '/' characters following the scheme's colon), make sure that there are only 2.

]]

local function split_url (url_str)
	local scheme, authority, domain;

	url_str = url_str:gsub ('([%a%d])%.?[/%?#].*$', '%1');	-- strip FQDN terminator and path(/), query(?), fragment (#) (the capture prevents false replacement of '//')

	if url_str:match ('^//%S*') then	-- if there is what appears to be a protocol relative url
		domain = url_str:match ('^//(%S*)')
	elseif url_str:match ('%S-:/*%S+') then	-- if there is what appears to be a scheme, optional authority indicator, and domain name
		scheme, authority, domain = url_str:match ('(%S-:)(/*)(%S+)');	-- extract the scheme, authority indicator, and domain portions
		authority = authority:gsub ('//', '', 1);	-- replace 1 pair of '/' with nothing;
		if is_set(authority) then	-- if anything left (1 or 3+ '/' where authority should be) then
			return scheme;	-- return scheme only making domain nil which will cause an error message
		end
		domain = domain:gsub ('(%a):%d+', '%1');	-- strip port number if present
	end

	return scheme, domain;
end

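The cases described in the split_url() comment can be traced with a stand-alone sketch. Both functions are local copies of the ones in the listing; the example.com URLs are illustrative inputs, not values from the module.

```lua
-- Local copies of is_set() and split_url() from the listing.
local function is_set (var)
	return not (var == nil or var == '');
end

local function split_url (url_str)
	local scheme, authority, domain;

	url_str = url_str:gsub ('([%a%d])%.?[/%?#].*$', '%1');	-- drop path, query, fragment

	if url_str:match ('^//%S*') then
		domain = url_str:match ('^//(%S*)')
	elseif url_str:match ('%S-:/*%S+') then
		scheme, authority, domain = url_str:match ('(%S-:)(/*)(%S+)');
		authority = authority:gsub ('//', '', 1);
		if is_set (authority) then
			return scheme;			-- malformed authority: domain stays nil
		end
		domain = domain:gsub ('(%a):%d+', '%1');	-- drop port number
	end

	return scheme, domain;
end

-- scheme, two slashes, path and query
local s1, d1 = split_url ('https://example.com/path?x=1');
-- protocol relative: no scheme
local s2, d2 = split_url ('//example.com/x');
-- only one slash after the colon: scheme is returned without a domain
local s3, d3 = split_url ('http:/example.com');
-- port number is removed from the domain
local s4, d4 = split_url ('http://example.com:8080/a');
```

In this sketch the first and fourth calls yield a scheme with the bare domain, the second yields a nil scheme with the domain, and the third yields the scheme alone, which is what lets is_url() reject it.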
--[[--------------------------< L I N K _ P A R A M _ O K >---------------------------------------------------

checks the content of |title-link=, |series-link=, |author-link= etc for properly formatted content: no wikilinks, no urls

Link parameters are to hold the title of a wikipedia article so none of the WP:TITLESPECIALCHARACTERS are allowed:
	# < > [ ] | { } _
except the underscore which is used as a space in wiki urls and # which is used for section links

returns false when the value contains any of these characters.

When there are no illegal characters, this function returns TRUE if value DOES NOT appear to be a valid url (the
|<param>-link= parameter is ok); else false when value appears to be a valid url (the |<param>-link= parameter is NOT ok).

]]

local function link_param_ok (value)
	local scheme, domain;
	if value:find ('[<>%[%]|{}]') then	-- if any prohibited characters
		return false;
	end

	scheme, domain = split_url (value);	-- get scheme or nil and domain or nil from url;
	return not is_url (scheme, domain);	-- return true if value DOES NOT appear to be a valid url
end

--[[--------------------------< C H E C K _ U R L >------------------------------------------------------------

Determines whether a URL string appears to be valid.

First we test for space characters. If any are found, return false. Then split the url into scheme and domain
portions, or for protocol relative (//example.com) urls, just the domain. Use is_url() to validate the two
portions of the url. If both are valid, or for protocol relative if domain is valid, return true, else false.

]]

local function check_url( url_str )
	if nil == url_str:match ("^%S+$") then	-- if there are any spaces in |url=value it can't be a proper url
		return false;
	end
	local scheme, domain;

	scheme, domain = split_url (url_str);	-- get scheme or nil and domain or nil from url;
	return is_url (scheme, domain);	-- return true if value appears to be a valid url
end

--[=[-------------------------< I S _ P A R A M E T E R _ E X T _ W I K I L I N K >----------------------------

Return true if a parameter value has a string that begins and ends with square brackets [ and ] and the first
non-space characters following the opening bracket appear to be a url. The test will also find external wikilinks
that use protocol relative urls. Also finds bare urls.

The frontier pattern prevents a match on interwiki links which are similar to scheme:path urls. The tests that
find bracketed urls are required because the parameters that call this test (currently |title=, |chapter=, |work=,
and |publisher=) may have wikilinks and there are articles or redirects like '//Hus' so, while uncommon, |title=[[//Hus]]
is possible as might be [[en://Hus]].

]=]

local function is_parameter_ext_wikilink (value)
	local scheme, domain;

	value = value:gsub ('([^%s/])/[%a%d].*', '%1');	-- strip path information (the capture prevents false replacement of '//')

	if value:match ('%f[%[]%[%a%S*:%S+.*%]') then	-- if ext wikilink with scheme and domain: [xxxx://yyyyy.zzz]
		scheme, domain = value:match ('%f[%[]%[(%a%S*:)(%S+).*%]')
	elseif value:match ('%f[%[]%[//%S*%.%S+.*%]') then	-- if protocol relative ext wikilink: [//yyyyy.zzz]
		domain = value:match ('%f[%[]%[//(%S*%.%S+).*%]');
	elseif value:match ('%a%S*:%S+') then	-- if bare url with scheme; may have leading or trailing plain text
		scheme, domain = value:match ('(%a%S*:)(%S+)');
	elseif value:match ('//%S*%.%S+') then	-- if protocol relative bare url: //yyyyy.zzz; may have leading or trailing plain text
		domain = value:match ('//(%S*%.%S+)');	-- what is left should be the domain
	else
		return false;	-- didn't find anything that is obviously a url
	end

	return is_url (scheme, domain);	-- return true if value appears to be a valid url
end

--[[-------------------------< C H E C K _ F O R _ U R L >-----------------------------------------------------

loop through a list of parameters and their values. Look at the value and if it has an external link, emit an error message.

]]

local function check_for_url (parameter_list)
	local error_message = '';
	for k, v in pairs (parameter_list) do	-- for each parameter in the list
		if is_parameter_ext_wikilink (v) then	-- look at the value; if there is a url add an error message
			if is_set(error_message) then	-- once we've added the first portion of the error message ...
				error_message=error_message .. ", ";	-- ... add a comma space separator
			end
			error_message=error_message .. "&#124;" .. k .. "=";	-- add the failed parameter
		end
	end
	if is_set (error_message) then	-- done looping, if there is an error message, display it
		table.insert( z.message_tail, { set_error( 'param_has_ext_link', {error_message}, true ) } );
	end
end

--[[--------------------------< S A F E _ F O R _ I T A L I C S >----------------------------------------------

Protects a string that will be wrapped in wiki italic markup '' ... ''

Note: We cannot use <i> for italics, as the expected behavior for italics specified by ''...'' in the title is that
they will be inverted (i.e. unitalicized) in the resulting references. In addition, <i> and '' tend to interact
poorly under Mediawiki's HTML tidy.

]]

local function safe_for_italics( str )
	if not is_set(str) then
		return str;
	else
		if str:sub(1,1) == "'" then str = "<span></span>" .. str; end
		if str:sub(-1,-1) == "'" then str = str .. "<span></span>"; end

		-- Remove newlines as they break italics.
		return str:gsub( '\n', ' ' );
	end
end

--[[--------------------------< S A F E _ F O R _ U R L >------------------------------------------------------

Escape sequences for content that will be used for URL descriptions

]]

local function safe_for_url( str )
	if str:match( "%[%[.-%]%]" ) ~= nil then
		table.insert( z.message_tail, { set_error( 'wikilink_in_url', {}, true ) } );
	end

	return str:gsub( '[%[%]\n]', {
		['['] = '&#91;',
		[']'] = '&#93;',
		['\n'] = ' ' } );
end

--[[--------------------------< W R A P _ S T Y L E >----------------------------------------------------------

Applies styling to various parameters. Supplied string is wrapped using a message_list configuration taking one
argument; protects italic styled parameters. Additional text taken from citation_config.presentation - the reason
this function is similar to but separate from wrap_msg().

]]

local function wrap_style (key, str)
	if not is_set( str ) then
		return "";
	elseif in_array( key, { 'italic-title', 'trans-italic-title' } ) then
		str = safe_for_italics( str );
	end

	return substitute( cfg.presentation[key], {str} );
end

--[[--------------------------< E X T E R N A L _ L I N K >----------------------------------------------------

Format an external link with error checking

]]

local function external_link( URL, label, source )
	local error_str = "";
	if not is_set( label ) then
		label = URL;
		if is_set( source ) then
			error_str = set_error( 'bare_url_missing_title', { wrap_style ('parameter', source) }, false, " " );
		else
			error( cfg.messages["bare_url_no_origin"] );
		end
	end
	if not check_url( URL ) then
		error_str = set_error( 'bad_url', {wrap_style ('parameter', source)}, false, " " ) .. error_str;
	end
	return table.concat({ "[", URL, " ", safe_for_url( label ), "]", error_str });
end

--[[--------------------------< E X T E R N A L _ L I N K _ I D >----------------------------------------------

Formats a wiki style external link

]]

local function external_link_id(options)
	local url_string = options.id;
	if options.encode == true or options.encode == nil then
		url_string = mw.uri.encode( url_string );
	end
	-- the format string needs no backslash escapes inside single quotes; the unnecessary ones are removed
	return mw.ustring.format( '[%s%s%s <span title="%s">%s%s%s</span>]',
		options.prefix, url_string, options.suffix or "",
		options.link, options.label, options.separator or "&nbsp;",
		mw.text.nowiki(options.id)
	);
end

--[[--------------------------< D E P R E C A T E D _ P A R A M E T E R >--------------------------------------

Categorize and emit an error message when the citation contains one or more deprecated parameters. The function adds the
offending parameter name to the error message. Only one error message is emitted regardless of the number of deprecated
parameters in the citation.

]]

local page_in_deprecated_cat;	-- sticky flag so that the category is added only once
local function deprecated_parameter(name)
	if not page_in_deprecated_cat then
		page_in_deprecated_cat = true;	-- note that we've added this category
		table.insert( z.message_tail, { set_error( 'deprecated_params', {name}, true ) } );	-- add error message
	end
end

--[[--------------------------< K E R N _ Q U O T E S >--------------------------------------------------------

Apply kerning to open the space between the quote mark provided by the Module and a leading or trailing quote mark contained in a |title= or |chapter= parameter's value.
This function will positive kern either single or double quotes:
	"'Unkerned title with leading and trailing single quote marks'"
	" 'Kerned title with leading and trailing single quote marks' " (in real life the kerning isn't as wide as this example)
Double single quotes (italic or bold wikimarkup) are not kerned.

Call this function for chapter titles, for website titles, etc; not for book titles.

]]

local function kern_quotes (str)
	local cap='';
	local cap2='';

	cap, cap2 = str:match ("^([\"\'])([^\'].+)");	-- match leading double or single quote but not double single quotes
	if is_set (cap) then
		str = substitute (cfg.presentation['kern-left'], {cap, cap2});
	end

	cap, cap2 = str:match ("^(.+[^\'])([\"\'])$")
	if is_set (cap) then
		str = substitute (cfg.presentation['kern-right'], {cap, cap2});
	end
	return str;
end

--[[--------------------------< F O R M A T _ S C R I P T _ V A L U E >----------------------------------------

|script-title= holds title parameters that are not written in Latin based scripts: Chinese, Japanese, Arabic, Hebrew, etc. These scripts should
not be italicized and may be written right-to-left. The value supplied by |script-title= is concatenated onto Title after Title has been wrapped
in italic markup.

Regardless of language, all values provided by |script-title= are wrapped in <bdi>...</bdi> tags to isolate rtl languages from the English left to right.

|script-title= provides a unique feature. The value in |script-title= may be prefixed with a two-character ISO639-1 language code and a colon:
	|script-title=ja:*** *** (where * represents a Japanese character)
Spaces between the two-character code and the colon and the colon and the first script character are allowed:
	|script-title=ja : *** ***
	|script-title=ja: *** ***
	|script-title=ja :*** ***
Spaces preceding the prefix are allowed: |script-title = ja:*** ***

The prefix is checked for validity. If it is a valid ISO639-1 language code, the lang attribute (lang="ja") is added to the <bdi> tag so that browsers can
know the language the tag contains. This may help the browser render the script more correctly. If the prefix is invalid, the lang attribute
is not added. At this time there is no error message for this condition.

Supports |script-title= and |script-chapter=

TODO: error messages when prefix is invalid ISO639-1 code; when script_value has prefix but no script;

]]

local function format_script_value (script_value)
	local lang='';	-- initialize to empty string
	local name;
	if script_value:match('^%l%l%s*:') then	-- if first 3 non-space characters are script language prefix
		lang = script_value:match('^(%l%l)%s*:%s*%S.*');	-- get the language prefix or nil if there is no script
		if not is_set (lang) then
			return '';	-- script_value was just the prefix so return empty string
		end
		-- if we get this far we have prefix and script
		name = mw.language.fetchLanguageName( lang, mw.getContentLanguage():getCode() );	-- get language name so that we can use it to categorize
		if is_set (name) then	-- is prefix a proper ISO 639-1 language code?
			script_value = script_value:gsub ('^%l%l%s*:%s*', '');	-- strip prefix from script
			-- is prefix one of these language codes?
			if in_array (lang, {'ar', 'bg', 'bs', 'dv', 'el', 'fa', 'he', 'hy', 'ja', 'ka', 'ko', 'ku', 'mk', 'ps', 'ru', 'sd', 'sr', 'th', 'uk', 'ug', 'yi', 'zh'}) then
				add_prop_cat ('script_with_name', {name, lang})
			else
				add_prop_cat ('script')
			end
			lang = ' lang="' .. lang .. '" ';	-- convert prefix into a lang attribute
		else
			lang = '';	-- invalid so set lang to empty string
		end
	end
	if is_set(script_value) then
		script_value = '-{R|' .. script_value .. '}-';
	end
	script_value = substitute (cfg.presentation['bdi'], {lang, script_value});	-- isolate in case script is rtl

	return script_value;
end

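The prefix rules in the format_script_value() comment can be tried out with a stand-alone sketch that uses the same two patterns. split_script_prefix is a hypothetical helper name introduced for illustration only; it mirrors just the prefix detection and stripping, and leaves out the language-name lookup, categorization, and <bdi> wrapping that the real function performs through mw.* and cfg.

```lua
-- Hypothetical helper (not part of the module): isolates the prefix handling.
-- Returns the two-letter prefix (or nil) and the remaining script text.
local function split_script_prefix (script_value)
	if script_value:match ('^%l%l%s*:') then			-- looks like a language prefix
		local lang = script_value:match ('^(%l%l)%s*:%s*%S.*');	-- nil when nothing follows the prefix
		if not lang then
			return nil, '';					-- prefix only, no script text
		end
		-- parentheses keep only gsub's first return value
		return lang, (script_value:gsub ('^%l%l%s*:%s*', ''));
	end
	return nil, script_value;					-- no prefix at all
end

-- '*' stands in for a script character, as in the comment above
local lang1, text1 = split_script_prefix ('ja:*** ***');
local lang2, text2 = split_script_prefix ('ja : *** ***');
local lang3, text3 = split_script_prefix ('ja:');
local lang4, text4 = split_script_prefix ('Title without prefix');
```

Unlike this sketch, the listing strips the prefix only after mw.language.fetchLanguageName() confirms that the code names a language, so an unrecognized two-letter prefix stays in the output there.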
--[[--------------------------< S C R I P T _ C O N C A T E N A T E >------------------------------------------

Initially for |title= and |script-title=, this function concatenates those two parameter values after the script value has been
wrapped in <bdi> tags.

]]

local function script_concatenate (title, script)
	if is_set(title) then
		title = '-{zh;zh-hans;zh-hant|' .. title .. '}-';
	end
	if is_set (script) then
		script = format_script_value (script);	-- <bdi> tags, lang attribute, categorization, etc; returns empty string on error
		if is_set (script) then
			title = title .. ' ' .. script;	-- concatenate title and script title
		end
	end
	return title;
end

--[[--------------------------< W R A P _ M S G >--------------------------------------------------------------

Applies additional message text to various parameter values. Supplied string is wrapped using a message_list
configuration taking one argument. Supports lower case text for {{citation}} templates. Additional text taken
from citation_config.messages - the reason this function is similar to but separate from wrap_style().

]]

local function wrap_msg (key, str, lower)
	if not is_set( str ) then
		return "";
	end
	if true == lower then
		local msg;
		msg = cfg.messages[key]:lower();	-- set the message to lower case before
		return substitute( msg, str );	-- including template text
	else
		return substitute( cfg.messages[key], str );
	end
end

  480. --[[-------------------------< I S _ A L I A S _ U S E D >-----------------------------------------------------
  481. This function is used by select_one() to determine if one of a list of alias parameters is in the argument list
  482. provided by the template.
  483. Input:
  484. args – pointer to the arguments table from calling template
  485. alias – one of the list of possible aliases in the aliases lists from Module:Citation/CS1/Configuration
  486. index – for enumerated parameters, identifies which one
  487. enumerated – true/false flag used choose how enumerated aliases are examined
  488. value – value associated with an alias that has previously been selected; nil if not yet selected
  489. selected – the alias that has previously been selected; nil if not yet selected
  490. error_list – list of aliases that are duplicates of the alias already selected
  491. Returns:
  492. value – value associated with alias we selected or that was previously selected or nil if an alias not yet selected
  493. selected – the alias we selected or the alias that was previously selected or nil if an alias not yet selected
  494. ]]
  495. local function is_alias_used (args, alias, index, enumerated, value, selected, error_list)
  496. if enumerated then -- is this a test for an enumerated parameters?
  497. alias = alias:gsub ('#', index); -- replace '#' with the value in index
  498. else
  499. alias = alias:gsub ('#', ''); -- remove '#' if it exists
  500. end
  501. if is_set(args[alias]) then -- alias is in the template's argument list
  502. if value ~= nil and selected ~= alias then -- if we have already selected one of the aliases
  503. local skip;
  504. for _, v in ipairs(error_list) do -- spin through the error list to see if we've added this alias
  505. if v == alias then
  506. skip = true;
  507. break; -- has been added so stop looking
  508. end
  509. end
  510. if not skip then -- has not been added so
  511. table.insert( error_list, alias ); -- add error alias to the error list
  512. end
  513. else
  514. value = args[alias]; -- not yet selected an alias, so select this one
  515. selected = alias;
  516. end
  517. end
  518. return value, selected; -- return newly selected alias, or previously selected alias
  519. end
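-- A minimal sketch of the '#' substitution that is_alias_used() performs before it looks a name up in args,
-- assuming plain Lua string functions. expand_alias is a hypothetical helper for illustration only; it is not
-- part of this module.

```lua
-- Hypothetical helper: turn an alias pattern such as 'author#-last' into a concrete parameter name.
local function expand_alias (alias, index, enumerated)
	local name
	if enumerated then
		name = alias:gsub ('#', tostring (index))	-- put the enumerator where the '#' is
	else
		name = alias:gsub ('#', '')					-- drop the '#' to get the non-enumerated name
	end
	return name										-- gsub's second result (the match count) is discarded
end

print (expand_alias ('author#-last', 2, true))		--> author2-last
print (expand_alias ('author#-last', 1, false))		--> author-last
```

-- When index is '1', select_one() tests both forms, because |<param>1= is an alias of |<param>=.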
  520. --[[--------------------------< S E L E C T _ O N E >----------------------------------------------------------
  521. Chooses one matching parameter from a list of parameters to consider. The list of parameters to consider is just
  522. names. For parameters that may be enumerated, the position of the enumerator in the parameter name is identified
  523. by the '#' so |author-last1= and |author1-last= are represented as 'author-last#' and 'author#-last'.
  524. Because enumerated parameter |<param>1= is an alias of |<param>= we must test for both possibilities.
  525. Generates an error if more than one match is present.
  526. ]]
  527. local function select_one( args, aliases_list, error_condition, index )
  528. local value = nil; -- the value assigned to the selected parameter
  529. local selected = ''; -- the name of the parameter we have chosen
  530. local error_list = {};
  531. if index ~= nil then index = tostring(index); end
  532. for _, alias in ipairs( aliases_list ) do -- for each alias in the aliases list
  533. if alias:match ('#') then -- if this alias can be enumerated
  534. if '1' == index then -- when index is 1 test for enumerated and non-enumerated aliases
  535. value, selected = is_alias_used (args, alias, index, false, value, selected, error_list); -- first test for non-enumerated alias
  536. end
  537. value, selected = is_alias_used (args, alias, index, true, value, selected, error_list); -- test for enumerated alias
  538. else
  539. value, selected = is_alias_used (args, alias, index, false, value, selected, error_list); --test for non-enumerated alias
  540. end
  541. end
  542. if #error_list > 0 and 'none' ~= error_condition then -- for cases where this code is used outside of extract_names()
  543. local error_str = "";
  544. for _, k in ipairs( error_list ) do
  545. if error_str ~= "" then error_str = error_str .. cfg.messages['parameter-separator'] end
  546. error_str = error_str .. wrap_style ('parameter', k);
  547. end
  548. if #error_list > 1 then
  549. error_str = error_str .. cfg.messages['parameter-final-separator'];
  550. else
  551. error_str = error_str .. cfg.messages['parameter-pair-separator'];
  552. end
  553. error_str = error_str .. wrap_style ('parameter', selected);
  554. table.insert( z.message_tail, { set_error( error_condition, {error_str}, true ) } );
  555. end
  556. return value, selected;
  557. end
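-- A rough sketch of how select_one() assembles the list of redundant parameters for its error message.
-- list_parameters is a hypothetical helper, and the separator strings passed in below are assumed values
-- standing in for cfg.messages['parameter-separator'], ['parameter-final-separator'] and
-- ['parameter-pair-separator']; the real strings live in Module:Citation/CS1/Configuration.

```lua
-- Hypothetical helper: join the duplicate aliases, then append the alias that was actually selected.
local function list_parameters (error_list, selected, sep, final_sep, pair_sep)
	local str = table.concat (error_list, sep)				-- 'a, b' for two or more duplicates
	if #error_list > 1 then
		str = str .. final_sep								-- three or more names in total
	else
		str = str .. pair_sep								-- exactly two names in total
	end
	return str .. selected
end

print (list_parameters ({'last'}, 'surname', ', ', ', and ', ' and '))
print (list_parameters ({'last', 'author'}, 'surname', ', ', ', and ', ' and '))
```

-- The module additionally wraps each name with wrap_style ('parameter', ...) before joining; that step is
-- omitted here.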
  558. --[[--------------------------< F O R M A T _ C H A P T E R _ T I T L E >--------------------------------------
  559. Format the four chapter parameters: |script-chapter=, |chapter=, |trans-chapter=, and |chapter-url= into a single Chapter meta-
  560. parameter (chapter_url_source used for error messages).
  561. ]]
  562. local function format_chapter_title (scriptchapter, chapter, transchapter, chapterurl, chapter_url_source, no_quotes)
  563. local chapter_error = '';
  564. if not is_set (chapter) then
  565. chapter = ''; -- to be safe for concatenation
  566. else
  567. if false == no_quotes then
  568. chapter = kern_quotes (chapter); -- if necessary, separate chapter title's leading and trailing quote marks from Module provided quote marks
  569. chapter = wrap_style ('quoted-title', chapter);
  570. end
  571. end
  572. chapter = script_concatenate (chapter, scriptchapter) -- <bdi> tags, lang attribute, categorization, etc; must be done after title is wrapped
  573. if is_set (transchapter) then
  574. transchapter = wrap_style ('trans-quoted-title', transchapter);
  575. if is_set (chapter) then
  576. chapter = chapter .. ' ' .. transchapter;
  577. else -- here when transchapter without chapter or script-chapter
  578. chapter = transchapter; --
  579. chapter_error = ' ' .. set_error ('trans_missing_title', {'chapter'});
  580. end
  581. end
  582. if is_set (chapterurl) then
  583. chapter = external_link (chapterurl, chapter, chapter_url_source); -- adds bare_url_missing_title error if appropriate
  584. end
  585. return chapter .. chapter_error;
  586. end
  587. --[[--------------------------< H A S _ I N V I S I B L E _ C H A R S >----------------------------------------
  588. This function searches a parameter's value for nonprintable or invisible characters. The search stops at the
  589. first match.
  590. This function will detect the visible replacement character when it is part of the wikisource.
  591. Detects but ignores nowiki and math stripmarkers. Also detects other named stripmarkers (gallery, math, pre, ref)
  592. and identifies them with a slightly different error message. See also coins_cleanup().
  593. Detects but ignores the character pattern that results from the transclusion of {{'}} templates.
  594. Output of this function is an error message that identifies the character or the Unicode group, or the stripmarker
  595. that was detected along with its position (or, for multi-byte characters, the position of its first byte) in the
  596. parameter value.
  597. ]]
  598. local function has_invisible_chars (param, v)
  599. local position = ''; -- position of invisible char or starting position of stripmarker
  600. local dummy; -- end of matching string; not used but required to hold end position when a capture is returned
  601. local capture; -- used by stripmarker detection to hold name of the stripmarker
  602. local i=1;
  603. local stripmarker, apostrophe;
  604. while cfg.invisible_chars[i] do
  605. local char=cfg.invisible_chars[i][1] -- the character or group name
  606. local pattern=cfg.invisible_chars[i][2] -- the pattern used to find it
  607. position, dummy, capture = mw.ustring.find (v, pattern) -- see if the parameter value contains characters that match the pattern
  608. if position then
  609. if 'nowiki' == capture or 'math' == capture or -- nowiki and math stripmarkers (not an error condition)
  610. ('templatestyles' == capture) then -- templatestyles stripmarker allowed
  611. stripmarker = true; -- set a flag
  612. elseif true == stripmarker and 'delete' == char then -- because stripmarkers begin and end with the delete char, assume that we've found one end of a stripmarker
  613. position = nil; -- unset
  614. elseif 'apostrophe' == char then -- apostrophe template uses &zwj;, hair space and zero-width space
  615. apostrophe = true;
  616. elseif true == apostrophe and in_array (char, {'zero width joiner', 'zero width space', 'hair space'}) then
  617. position = nil; -- unset
  618. else
  619. local err_msg;
  620. if capture then
  621. err_msg = capture .. ' ' .. (cfg.invisible_chars[i][3] or char); -- parenthesized because '..' binds tighter than 'or'
  622. else
  623. err_msg = cfg.invisible_chars[i][3] or (char .. ' character');
  624. end
  625. table.insert( z.message_tail, { set_error( 'invisible_char', {err_msg, wrap_style ('parameter', param), position}, true ) } ); -- add error message
  626. return; -- and done with this parameter
  627. end
  628. end
  629. i=i+1; -- bump our index
  630. end
  631. end
  632. --[[--------------------------< A R G U M E N T _ W R A P P E R >----------------------------------------------
  633. Argument wrapper. This function provides support for argument mapping defined in the configuration file so that
  634. multiple names can be transparently aliased to a single internal variable.
  635. ]]
  636. local function argument_wrapper( args )
  637. local origin = {};
  638. return setmetatable({
  639. ORIGIN = function( self, k )
  640. local dummy = self[k]; --force the variable to be loaded.
  641. return origin[k];
  642. end
  643. },
  644. {
  645. __index = function ( tbl, k )
  646. if origin[k] ~= nil then
  647. return nil;
  648. end
  649. local args, list, v = args, cfg.aliases[k];
  650. if type( list ) == 'table' then
  651. v, origin[k] = select_one( args, list, 'redundant_parameters' );
  652. if origin[k] == nil then
  653. origin[k] = ''; -- Empty string, not nil
  654. end
  655. elseif list ~= nil then
  656. v, origin[k] = args[list], list;
  657. else
  658. -- maybe let through instead of raising an error?
  659. -- v, origin[k] = args[k], k;
  660. error( cfg.messages['unknown_argument_map'] );
  661. end
  662. -- Empty strings, not nil;
  663. if v == nil then
  664. v = cfg.defaults[k] or '';
  665. origin[k] = '';
  666. end
  667. tbl = rawset( tbl, k, v );
  668. return v;
  669. end,
  670. });
  671. end
  672. --[[--------------------------< V A L I D A T E >--------------------------------------------------------------
  673. Looks for a parameter's name in the whitelist.
  674. Parameters in the whitelist can have three values:
  675. true - active, supported parameters
  676. false - deprecated, supported parameters
  677. nil - unsupported parameters
  678. ]]
  679. local function validate( name )
  680. local name = tostring( name );
  681. local state = whitelist.basic_arguments[ name ];
  682. -- Normal arguments
  683. if true == state then return true; end -- valid actively supported parameter
  684. if false == state then
  685. deprecated_parameter (name); -- parameter is deprecated but still supported
  686. return true;
  687. end
  688. -- Arguments with numbers in them
  689. name = name:gsub( "%d+", "#" ); -- replace digit(s) with # (last25 becomes last#)
  690. state = whitelist.numbered_arguments[ name ];
  691. if true == state then return true; end -- valid actively supported parameter
  692. if false == state then
  693. deprecated_parameter (name); -- parameter is deprecated but still supported
  694. return true;
  695. end
  696. return false; -- Not supported because not found or name is set to nil
  697. end
  698. -- Formats a wiki style internal link
  699. local function internal_link_id(options)
  700. return mw.ustring.format( '[[%s%s%s|<span title="%s">%s</span>%s%s]]',
  701. options.prefix, options.id, options.suffix or "",
  702. options.link, options.label, options.separator or "&nbsp;",
  703. mw.text.nowiki(options.id)
  704. );
  705. end
  706. --[[--------------------------< N O W R A P _ D A T E >--------------------------------------------------------
  707. When date is YYYY-MM-DD format wrap in nowrap span: <span ...>YYYY-MM-DD</span>. When date is DD MMMM YYYY or is
  708. MMMM DD, YYYY then wrap in nowrap span: <span ...>DD MMMM</span> YYYY or <span ...>MMMM DD,</span> YYYY
  709. DOES NOT yet support MMMM YYYY or any of the date ranges.
  710. ]]
  711. local function nowrap_date (date)
  712. local cap='';
  713. local cap2='';
  714. if date:match("^%d%d%d%d%-%d%d?%-%d%d?$") then
  715. local y, m, d = date:match("(%d%d%d%d)%-(%d%d?)%-(%d%d?)$")
  716. date = substitute (cfg.presentation['nowrap1'], y..'-'..string.format('%02d', m)..'-'..string.format('%02d', d));
  717. elseif date:match("^%a+%s*%d%d?,%s+%d%d%d%d$") or date:match ("^%d%d?%s*%a+%s+%d%d%d%d$") then
  718. cap, cap2 = string.match (date, "^(.*)%s+(%d%d%d%d)$");
  719. date = substitute (cfg.presentation['nowrap2'], {cap, cap2});
  720. end
  721. return date;
  722. end
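-- A small sketch of the YYYY-MM-DD branch of nowrap_date(): one- or two-digit month and day values are
-- zero-padded before the date is wrapped. pad_iso_date is a hypothetical helper; the nowrap <span> that
-- cfg.presentation['nowrap1'] supplies is left out because its markup is defined elsewhere.

```lua
-- Hypothetical helper: normalize 2020-1-5 to 2020-01-05; return anything else unchanged.
local function pad_iso_date (date)
	local y, m, d = date:match ('^(%d%d%d%d)%-(%d%d?)%-(%d%d?)$')
	if not y then
		return date											-- not a numeric year-month-day date
	end
	return string.format ('%s-%02d-%02d', y, tonumber (m), tonumber (d))
end

print (pad_iso_date ('2020-1-5'))							--> 2020-01-05
print (pad_iso_date ('5 January 2020'))						--> 5 January 2020
```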
  723. --[[--------------------------< I S _ V A L I D _ I S X N >-----------------------------------------------------
  724. ISBN-10 and ISSN validator code calculates checksum across all isbn/issn digits including the check digit. ISBN-13 is checked in check_isbn().
  725. If the number is valid the checksum modulo 11 will be 0 and the function returns true. Before calling this function, isbn/issn must be checked for length and stripped of dashes,
  726. spaces and other non-isxn characters.
  727. ]]
  728. local function is_valid_isxn (isxn_str, len)
  729. local temp = 0;
  730. isxn_str = { isxn_str:byte(1, len) }; -- make a table of byte values '0' → 0x30 .. '9' → 0x39, 'X' → 0x58
  731. len = len+1; -- adjust to be a loop counter
  732. for i, v in ipairs( isxn_str ) do -- loop through all of the bytes and calculate the checksum
  733. if v == string.byte( "X" ) then -- if checkdigit is X (compares the byte value of 'X' which is 0x58)
  734. temp = temp + 10*( len - i ); -- it represents 10 decimal
  735. else
  736. temp = temp + tonumber( string.char(v) )*(len-i);
  737. end
  738. end
  739. return temp % 11 == 0; -- returns true if calculation result is zero
  740. end
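-- A worked example of the weighted checksum that is_valid_isxn() computes, written as a standalone sketch.
-- weighted_sum is a hypothetical helper; it takes an already-cleaned digit string (no hyphens or spaces).
-- The leftmost digit gets the largest weight (10 for ISBN-10, 8 for ISSN) and the check digit gets weight 1.

```lua
-- Hypothetical helper: sum of digit * weight, where 'X' counts as 10.
local function weighted_sum (digits)
	local len = #digits + 1
	local sum = 0
	for i = 1, #digits do
		local c = digits:sub (i, i)
		local value = ('X' == c) and 10 or tonumber (c)
		sum = sum + value * (len - i)
	end
	return sum
end

print (weighted_sum ('0306406152'))		--> 132, and 132 % 11 == 0, so this ISBN-10 passes
print (weighted_sum ('03785955'))		--> 165, and 165 % 11 == 0, so this ISSN passes
print (weighted_sum ('0306406153'))		--> 133, and 133 % 11 == 1, so this one fails
```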
  741. --[[--------------------------< I S _ V A L I D _ I S X N _ 1 3 >----------------------------------------------
  742. ISBN-13 and ISMN validator code calculates checksum across all 13 isbn/ismn digits including the check digit.
  743. If the number is valid, the result will be 0. Before calling this function, isbn-13/ismn must be checked for length
  744. and stripped of dashes, spaces and other non-isxn-13 characters.
  745. ]]
  746. local function is_valid_isxn_13 (isxn_str)
  747. local temp=0;
  748. isxn_str = { isxn_str:byte(1, 13) }; -- make a table of byte values '0' → 0x30 .. '9' → 0x39
  749. for i, v in ipairs( isxn_str ) do
  750. temp = temp + (3 - 2*(i % 2)) * tonumber( string.char(v) ); -- multiply odd index digits by 1, even index digits by 3 and sum; includes check digit
  751. end
  752. return temp % 10 == 0; -- sum modulo 10 is zero when isbn-13/ismn is correct
  753. end
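-- A worked example of the ISBN-13 / ISMN checksum in is_valid_isxn_13(), as a standalone sketch.
-- isxn_13_sum is a hypothetical helper; the weight expression 3 - 2*(i % 2) gives 1 for odd positions and
-- 3 for even positions.

```lua
-- Hypothetical helper: alternating 1/3 weighted sum over all 13 digits, check digit included.
local function isxn_13_sum (digits)
	local sum = 0
	for i = 1, #digits do
		local weight = 3 - 2 * (i % 2)
		sum = sum + weight * tonumber (digits:sub (i, i))
	end
	return sum
end

print (isxn_13_sum ('9780306406157'))	--> 100, and 100 % 10 == 0, so this ISBN-13 passes
print (isxn_13_sum ('9780306406158'))	--> 101, so this one fails
```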
  754. --[[--------------------------< C H E C K _ I S B N >------------------------------------------------------------
  755. Determines whether an ISBN string is valid
  756. ]]
  757. local function check_isbn( isbn_str )
  758. if nil ~= isbn_str:match("[^%s-0-9X]") then return false; end -- fail if isbn_str contains anything but digits, hyphens, whitespace, or the uppercase X
  759. isbn_str = isbn_str:gsub( "-", "" ):gsub( " ", "" ); -- remove hyphens and spaces
  760. local len = isbn_str:len();
  761. if len ~= 10 and len ~= 13 then
  762. return false;
  763. end
  764. if len == 10 then
  765. if isbn_str:match( "^%d*X?$" ) == nil then return false; end
  766. return is_valid_isxn(isbn_str, 10);
  767. else
  768. local temp = 0;
  769. if isbn_str:match( "^97[89]%d*$" ) == nil then return false; end -- isbn13 begins with 978 or 979; ismn begins with 979
  770. return is_valid_isxn_13 (isbn_str);
  771. end
  772. end
  773. --[[--------------------------< C H E C K _ I S M N >------------------------------------------------------------
  774. Determines whether an ISMN string is valid. Similar to isbn-13, ismn is 13 digits beginning 979-0-... and uses the
  775. same check digit calculations. See http://www.ismn-international.org/download/Web_ISMN_Users_Manual_2008-6.pdf
  776. section 2, pages 9–12.
  777. ]]
  778. local function ismn (id)
  779. local handler = cfg.id_handlers['ISMN'];
  780. local text;
  781. local valid_ismn = true;
  782. id=id:gsub( "[%s-–]", "" ); -- strip spaces, hyphens, and endashes from the ismn
  783. if 13 ~= id:len() or id:match( "^9790%d*$" ) == nil then -- ismn must be 13 digits and begin 9790
  784. valid_ismn = false;
  785. else
  786. valid_ismn=is_valid_isxn_13 (id); -- validate ismn
  787. end
  788. -- text = internal_link_id({link = handler.link, label = handler.label, -- use this (or external version) when there is some place to link to
  789. -- prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode})
  790. text="[[" .. handler.link .. "|" .. handler.label .. "]]" .. handler.separator .. id; -- because no place to link to yet
  791. if false == valid_ismn then
  792. text = text .. ' ' .. set_error( 'bad_ismn' ) -- add an error message if the ismn is invalid
  793. end
  794. return text;
  795. end
  796. --[[--------------------------< I S S N >----------------------------------------------------------------------
  797. Validate and format an issn. This code fixes the case where an editor has included an ISSN in the citation but has separated the two groups of four
  798. digits with a space. When that condition occurred, the resulting link looked like this:
  799. |issn=0819 4327 gives: [http://www.worldcat.org/issn/0819 4327 0819 4327] -- can't have spaces in an external link
  800. This code now prevents that by inserting a hyphen at the issn midpoint. It also validates the issn for length and makes sure that the checkdigit agrees
  801. with the calculated value. A length other than 8 digits, characters other than 0-9 and X, or a checkdigit / calculated value mismatch will all cause a check issn
  802. error message. The issn is always displayed with a hyphen, even if the issn was given as a single group of 8 digits.
  803. ]]
  804. local function issn(id, e)
  805. local issn_copy = id; -- save a copy of unadulterated issn; use this version for display if issn does not validate
  806. local handler;
  807. local text;
  808. local valid_issn = true;
  809. if e then
  810. handler = cfg.id_handlers['EISSN'];
  811. else
  812. handler = cfg.id_handlers['ISSN'];
  813. end
  814. id=id:gsub( "[%s-–]", "" ); -- strip spaces, hyphens, and endashes from the issn
  815. if 8 ~= id:len() or nil == id:match( "^%d*X?$" ) then -- validate the issn: 8 digits long, containing only 0-9 or X in the last position
  816. valid_issn=false; -- wrong length or improper character
  817. else
  818. valid_issn=is_valid_isxn(id, 8); -- validate issn
  819. end
  820. if true == valid_issn then
  821. id = string.sub( id, 1, 4 ) .. "-" .. string.sub( id, 5 ); -- if valid, display correctly formatted version
  822. else
  823. id = issn_copy; -- if not valid, show the invalid issn with error message
  824. end
  825. text = external_link_id({link = handler.link, label = handler.label,
  826. prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode})
  827. if false == valid_issn then
  828. text = text .. ' ' .. set_error( 'bad_issn' ) -- add an error message if the issn is invalid
  829. end
  830. return text
  831. end
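-- A minimal sketch of the clean-up and display step in issn(): separators are stripped, the shape is
-- checked, and a hyphen is inserted at the midpoint. tidy_issn is a hypothetical helper. As a simplifying
-- assumption it strips only whitespace and ASCII hyphens (the module also strips endashes), and it does
-- not verify the check digit.

```lua
-- Hypothetical helper: return the ISSN as NNNN-NNNC, or nil when length or characters are wrong.
local function tidy_issn (raw)
	local id = raw:gsub ('[%s%-]', '')						-- keep only the first gsub result
	if 8 ~= #id or not id:match ('^%d+X?$') then
		return nil											-- wrong length or improper character
	end
	return id:sub (1, 4) .. '-' .. id:sub (5)
end

print (tidy_issn ('0819 4327'))								--> 0819-4327
print (tidy_issn ('08194327'))								--> 0819-4327
print (tidy_issn ('0819-432'))								--> nil
```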
  832. --[[--------------------------< A M A Z O N >------------------------------------------------------------------
  833. Formats a link to Amazon. Do simple error checking: asin must be a mix of 10 numeric or uppercase alpha
  834. characters. If a mix, first character must be uppercase alpha; if all numeric, asins must be 10-digit
  835. isbn. If 10-digit isbn, add a maintenance category so a bot or awb script can replace |asin= with |isbn=.
  836. Error message if not 10 characters, if not isbn10, if mixed and first character is a digit.
  837. ]]
  838. local function amazon(id, domain)
  839. local err_cat = ""
  840. if not id:match("^[%d%u][%d%u][%d%u][%d%u][%d%u][%d%u][%d%u][%d%u][%d%u][%d%u]$") then
  841. err_cat = ' ' .. set_error ('bad_asin'); -- asin is not a mix of 10 uppercase alpha and numeric characters
  842. else
  843. if id:match("^%d%d%d%d%d%d%d%d%d[%dX]$") then -- if 10-digit numeric (or 9 digits with terminal X)
  844. if check_isbn( id ) then -- see if asin value is isbn10
  845. add_maint_cat ('ASIN');
  846. elseif not is_set (err_cat) then
  847. err_cat = ' ' .. set_error ('bad_asin'); -- asin is not isbn10
  848. end
  849. elseif not id:match("^%u[%d%u]+$") then
  850. err_cat = ' ' .. set_error ('bad_asin'); -- asin doesn't begin with uppercase alpha
  851. end
  852. end
  853. if not is_set(domain) then
  854. domain = "com";
  855. elseif in_array (domain, {'jp', 'uk'}) then -- Japan, United Kingdom
  856. domain = "co." .. domain;
  857. elseif in_array (domain, {'au', 'br', 'mx'}) then -- Australia, Brazil, Mexico
  858. domain = "com." .. domain;
  859. end
  860. local handler = cfg.id_handlers['ASIN'];
  861. return external_link_id({link=handler.link,
  862. label=handler.label, prefix=handler.prefix .. domain .. "/dp/",
  863. id=id, encode=handler.encode, separator = handler.separator}) .. err_cat;
  864. end
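-- A short sketch of the domain mapping at the end of amazon(), using lookup tables in place of in_array().
-- amazon_domain is a hypothetical helper for illustration; the country lists are the ones in the code above.

```lua
-- Hypothetical helper: map the |asin-tld= style value to the host suffix used in the link.
local function amazon_domain (domain)
	local co_domains = {jp = true, uk = true}				-- Japan, United Kingdom
	local com_domains = {au = true, br = true, mx = true}	-- Australia, Brazil, Mexico
	if nil == domain or '' == domain then
		return 'com'										-- default when no domain was given
	elseif co_domains[domain] then
		return 'co.' .. domain
	elseif com_domains[domain] then
		return 'com.' .. domain
	end
	return domain											-- anything else is used unchanged
end

print (amazon_domain ('uk'))								--> co.uk
print (amazon_domain ('br'))								--> com.br
print (amazon_domain ('de'))								--> de
```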
  865. --[[--------------------------< A R X I V >--------------------------------------------------------------------
  866. See: http://arxiv.org/help/arxiv_identifier
  867. format and error check arXiv identifier. There are three valid forms of the identifier:
  868. the first form, valid only between date codes 9108 and 0703 is:
  869. arXiv:<archive>.<class>/<date code><number><version>
  870. where:
  871. <archive> is a string of alpha characters - may be hyphenated; no other punctuation
  872. <class> is a string of alpha characters - may be hyphenated; no other punctuation
  873. <date code> is four digits in the form YYMM where YY is the last two digits of the four-digit year and MM is the month number January = 01
  874. first digit of YY for this form can only be 9 or 0
  875. <number> is a three-digit number
  876. <version> is a 1 or more digit number preceded with a lowercase v; no spaces (undocumented)
  877. the second form, valid from April 2007 through December 2014 is:
  878. arXiv:<date code>.<number><version>
  879. where:
  880. <date code> is four digits in the form YYMM where YY is the last two digits of the four-digit year and MM is the month number January = 01
  881. <number> is a four-digit number
  882. <version> is a 1 or more digit number preceded with a lowercase v; no spaces
  883. the third form, valid from January 2015 is:
  884. arXiv:<date code>.<number><version>
  885. where:
  886. <date code> and <version> are as defined for 0704-1412
  887. <number> is a five-digit number
  888. ]]
  889. local function arxiv (id, class)
  890. local handler = cfg.id_handlers['ARXIV'];
  891. local year, month, version;
  892. local err_cat = '';
  893. local text;
  894. if id:match("^%a[%a%.%-]+/[90]%d[01]%d%d%d%d$") or id:match("^%a[%a%.%-]+/[90]%d[01]%d%d%d%dv%d+$") then -- test for the 9108-0703 format w/ & w/o version
  895. year, month = id:match("^%a[%a%.%-]+/([90]%d)([01]%d)%d%d%d[v%d]*$");
  896. year = tonumber(year);
  897. month = tonumber(month);
  898. if ((not (90 < year or 8 > year)) or (1 > month or 12 < month)) or -- if invalid year or invalid month
  899. ((91 == year and 7 > month) or (7 == year and 3 < month)) then -- if years ok, are starting and ending months ok?
  900. err_cat = ' ' .. set_error( 'bad_arxiv' ); -- set error message
  901. end
  902. elseif id:match("^%d%d[01]%d%.%d%d%d%d$") or id:match("^%d%d[01]%d%.%d%d%d%dv%d+$") then -- test for the 0704-1412 w/ & w/o version
  903. year, month = id:match("^(%d%d)([01]%d)%.%d%d%d%d[v%d]*$");
  904. year = tonumber(year);
  905. month = tonumber(month);
  906. if ((7 > year) or (14 < year) or (1 > month or 12 < month)) or -- is year invalid or is month invalid? (doesn't test for future years)
  907. ((7 == year) and (4 > month)) then -- when year is 07, is month invalid (before April)?
  908. err_cat = ' ' .. set_error( 'bad_arxiv' ); -- set error message
  909. end
  910. elseif id:match("^%d%d[01]%d%.%d%d%d%d%d$") or id:match("^%d%d[01]%d%.%d%d%d%d%dv%d+$") then -- test for the 1501- format w/ & w/o version
  911. year, month = id:match("^(%d%d)([01]%d)%.%d%d%d%d%d[v%d]*$");
  912. year = tonumber(year);
  913. month = tonumber(month);
  914. if ((15 > year) or (1 > month or 12 < month)) then -- is year invalid or is month invalid? (doesn't test for future years)
  915. err_cat = ' ' .. set_error( 'bad_arxiv' ); -- set error message
  916. end
  917. else
  918. err_cat = ' ' .. set_error( 'bad_arxiv' ); -- arXiv id doesn't match any format
  919. end
  920. text = external_link_id({link = handler.link, label = handler.label,
  921. prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode}) .. err_cat;
  922. if is_set (class) then
  923. class = ' [[' .. '//arxiv.org/archive/' .. class .. ' ' .. class .. ']]'; -- external link within square brackets, not wikilink
  924. else
  925. class = ''; -- empty string for concatenation
  926. end
  927. return text .. class;
  928. end
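-- A simplified sketch of how the three arXiv identifier shapes described above can be told apart with Lua
-- patterns. arxiv_form is a hypothetical helper and the returned labels are made up for this example. It
-- classifies by shape only; it does not apply the year and month windows that arxiv() checks.

```lua
-- Hypothetical helper: return a label for the identifier's shape, or nil when no shape matches.
local function arxiv_form (id)
	id = id:gsub ('v%d+$', '')								-- drop an optional version suffix
	if id:match ('^%a[%a%.%-]+/%d%d%d%d%d%d%d$') then
		return 'archive/YYMMNNN'							-- 9108-0703 form
	elseif id:match ('^%d%d%d%d%.%d%d%d%d$') then
		return 'YYMM.NNNN'									-- 0704-1412 form
	elseif id:match ('^%d%d%d%d%.%d%d%d%d%d$') then
		return 'YYMM.NNNNN'									-- 1501 and later form
	end
	return nil
end

print (arxiv_form ('hep-th/9901001'))						--> archive/YYMMNNN
print (arxiv_form ('0704.0001v2'))							--> YYMM.NNNN
print (arxiv_form ('1501.00001'))							--> YYMM.NNNNN
```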
  929. --[[
  930. lccn normalization (http://www.loc.gov/marc/lccn-namespace.html#normalization)
  931. 1. Remove all blanks.
  932. 2. If there is a forward slash (/) in the string, remove it, and remove all characters to the right of the forward slash.
  933. 3. If there is a hyphen in the string:
  934. a. Remove it.
  935. b. Inspect the substring following (to the right of) the (removed) hyphen. Then (and assuming that steps 1 and 2 have been carried out):
  936. 1. All these characters should be digits, and there should be six or less. (not done in this function)
  937. 2. If the length of the substring is less than 6, left-fill the substring with zeroes until the length is six.
  938. Returns a normalized lccn for lccn() to validate. There is no error checking (step 3.b.1) performed in this function.
  939. ]]
  940. local function normalize_lccn (lccn)
  941. lccn = lccn:gsub ("%s", ""); -- 1. strip whitespace
  942. if nil ~= string.find (lccn,'/') then
  943. lccn = lccn:match ("(.-)/"); -- 2. remove forward slash and all characters to the right of it
  944. end
  945. local prefix
  946. local suffix
  947. prefix, suffix = lccn:match ("(.+)%-(.+)"); -- 3.a remove hyphen by splitting the string into prefix and suffix
  948. if nil ~= suffix then -- if there was a hyphen
  949. suffix=string.rep("0", 6-string.len (suffix)) .. suffix; -- 3.b.2 left fill the suffix with 0s if suffix length less than 6
  950. lccn=prefix..suffix; -- reassemble the lccn
  951. end
  952. return lccn;
  953. end
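-- A standalone sketch of the three normalization steps listed above, with a few sample inputs.
-- normalize is a hypothetical stand-in written for this example (it is not the module's function), and
-- like normalize_lccn() it performs no validation of the result.

```lua
-- Hypothetical helper: strip blanks, cut at the first '/', and left-fill the part after a hyphen to 6 digits.
local function normalize (lccn)
	lccn = lccn:gsub ('%s', '')								-- 1. remove all blanks
	lccn = lccn:match ('^([^/]*)')							-- 2. keep only what precedes the first '/'
	local prefix, suffix = lccn:match ('^(.+)%-(.+)$')		-- 3.a split at the hyphen
	if suffix then
		lccn = prefix .. string.rep ('0', 6 - #suffix) .. suffix	-- 3.b.2 left-fill with zeroes
	end
	return lccn
end

print (normalize ('n78-89035'))								--> n78089035
print (normalize ('85-2 '))									--> 85000002
print (normalize ('75-425165//r75'))						--> 75425165
```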
  954. --[[
  955. Format LCCN link and do simple error checking. LCCN is a character string 8-12 characters long. The length of the LCCN dictates the character type of the first 1-3 characters; the
  956. rightmost eight are always digits. http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:lccn/
  957. length = 8 then all digits
  958. length = 9 then lccn[1] is lower case alpha
  959. length = 10 then lccn[1] and lccn[2] are both lower case alpha or both digits
  960. length = 11 then lccn[1] is lower case alpha, lccn[2] and lccn[3] are both lower case alpha or both digits
  961. length = 12 then lccn[1] and lccn[2] are both lower case alpha
  962. ]]
  963. local function lccn(lccn)
  964. local handler = cfg.id_handlers['LCCN'];
  965. local err_cat = ''; -- presume that LCCN is valid
  966. local id = lccn; -- local copy of the lccn
  967. id = normalize_lccn (id); -- get canonical form (no whitespace, hyphens, forward slashes)
  968. local len = id:len(); -- get the length of the lccn
  969. if 8 == len then
  970. if id:match("[^%d]") then -- if LCCN has anything but digits (nil if only digits)
  971. err_cat = ' ' .. set_error( 'bad_lccn' ); -- set an error message
  972. end
  973. elseif 9 == len then -- LCCN should be adddddddd
  974. if nil == id:match("%l%d%d%d%d%d%d%d%d") then -- does it match our pattern?
  975. err_cat = ' ' .. set_error( 'bad_lccn' ); -- set an error message
  976. end
  977. elseif 10 == len then -- LCCN should be aadddddddd or dddddddddd
  978. if id:match("[^%d]") then -- if LCCN has anything but digits (nil if only digits) ...
  979. if nil == id:match("^%l%l%d%d%d%d%d%d%d%d") then -- ... see if it matches our pattern
  980. err_cat = ' ' .. set_error( 'bad_lccn' ); -- no match, set an error message
  981. end
  982. end
  983. elseif 11 == len then -- LCCN should be aaadddddddd or adddddddddd
  984. if not (id:match("^%l%l%l%d%d%d%d%d%d%d%d") or id:match("^%l%d%d%d%d%d%d%d%d%d%d")) then -- see if it matches one of our patterns
  985. err_cat = ' ' .. set_error( 'bad_lccn' ); -- no match, set an error message
  986. end
  987. elseif 12 == len then -- LCCN should be aadddddddddd
  988. if not id:match("^%l%l%d%d%d%d%d%d%d%d%d%d") then -- see if it matches our pattern
  989. err_cat = ' ' .. set_error( 'bad_lccn' ); -- no match, set an error message
  990. end
  991. else
  992. err_cat = ' ' .. set_error( 'bad_lccn' ); -- wrong length, set an error message
  993. end
  994. if not is_set (err_cat) and nil ~= lccn:find ('%s') then
  995. err_cat = ' ' .. set_error( 'bad_lccn' ); -- lccn contains a space, set an error message
  996. end
  997. return external_link_id({link = handler.link, label = handler.label,
  998. prefix=handler.prefix,id=lccn,separator=handler.separator, encode=handler.encode}) .. err_cat;
  999. end
  1000. --[[--------------------------< P M I D >----------------------------------------------------------------------
  1001. Format PMID and do simple error checking. PMIDs are sequential numbers beginning at 1 and counting up. This
  1002. code checks the PMID to see that it contains only digits and is less than test_limit; the value in local variable
  1003. test_limit will need to be updated periodically as more PMIDs are issued.
  1004. ]]
  1005. local function pmid(id)
  1006. local test_limit = 33000000; -- update this value as PMIDs approach
  1007. local handler = cfg.id_handlers['PMID'];
  1008. local err_cat = ''; -- presume that PMID is valid
  1009. if id:match("[^%d]") then -- if PMID has anything but digits
  1010. err_cat = ' ' .. set_error( 'bad_pmid' ); -- set an error message
  1011. else -- PMID is only digits
  1012. local id_num = tonumber(id); -- convert id to a number for range testing
  1013. if 1 > id_num or test_limit < id_num then -- if PMID is outside test limit boundaries
  1014. err_cat = ' ' .. set_error( 'bad_pmid' ); -- set an error message
  1015. end
  1016. end
  1017. return external_link_id({link = handler.link, label = handler.label,
  1018. prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode}) .. err_cat;
  1019. end
  1020. --[[--------------------------< I S _ E M B A R G O E D >------------------------------------------------------
  1021. Determines if a PMC identifier's online version is embargoed. Compares the date in |embargo= against today's date. If embargo date is
  1022. in the future, returns the content of |embargo=; otherwise, returns an empty string because the embargo has expired or because
  1023. |embargo= was not set in this cite.
  1024. ]]
  1025. local function is_embargoed (embargo)
  1026. if is_set (embargo) then
  1027. local lang = mw.getContentLanguage();
  1028. local good1, embargo_date, good2, todays_date;
  1029. good1, embargo_date = pcall( lang.formatDate, lang, 'U', embargo );
  1030. good2, todays_date = pcall( lang.formatDate, lang, 'U' );
  1031. if good1 and good2 then -- if embargo date and today's date are good dates
  1032. if tonumber( embargo_date ) >= tonumber( todays_date ) then -- is embargo date in the future?
  1033. return embargo; -- still embargoed
  1034. else
  1035. add_maint_cat ('embargo')
  1036. return ''; -- unset because embargo has expired
  1037. end
  1038. end
  1039. end
  1040. return ''; -- |embargo= not set or not a parsable date; return empty string
  1041. end
  1042. --[[--------------------------< P M C >------------------------------------------------------------------------
  1043. Format a PMC, do simple error checking, and check for embargoed articles.
  1044. The embargo parameter takes a date for a value. If the embargo date is in the future the PMC identifier will not
  1045. be linked to the article. If the embargo date is today or in the past, or if it is empty or omitted, then the
  1046. PMC identifier is linked to the article through the link at cfg.id_handlers['PMC'].prefix.
  1047. PMC embargo date testing is done in function is_embargoed () which is called earlier because when the citation
  1048. has |pmc=<value> but does not have a |url= then |title= is linked with the PMC link. Function is_embargoed ()
  1049. returns the embargo date if the PMC article is still embargoed, otherwise it returns an empty string.
  1050. PMCs are sequential numbers beginning at 1 and counting up. This code checks the PMC to see that it contains only digits and is less
  1051. than test_limit; the value in local variable test_limit will need to be updated periodically as more PMCs are issued.
  1052. ]]
  1053. local function pmc(id, embargo)
  1054. local test_limit = 7000000; -- update this value as PMCs approach
  1055. local handler = cfg.id_handlers['PMC'];
  1056. local err_cat = ''; -- presume that PMC is valid
  1057. local id_num;
  1058. local text;
  1059. id_num = id:match ('^[Pp][Mm][Cc](%d+)$'); -- identifier with pmc prefix
  1060. if is_set (id_num) then
  1061. add_maint_cat ('pmc_format');
  1062. else -- plain number without pmc prefix
  1063. id_num = id:match ('^%d+$'); -- if here id is all digits
  1064. end
  1065. if is_set (id_num) then -- id_num has a value so test it
  1066. id_num = tonumber(id_num); -- convert id_num to a number for range testing
  1067. if 1 > id_num or test_limit < id_num then -- if PMC is outside test limit boundaries
  1068. err_cat = ' ' .. set_error( 'bad_pmc' ); -- set an error message
  1069. else
  1070. id = tostring (id_num); -- make sure id is a string
  1071. end
  1072. else -- when id format incorrect
  1073. err_cat = ' ' .. set_error( 'bad_pmc' ); -- set an error message
  1074. end
	if is_set (embargo) then	-- is the PMC still embargoed?
  1076. text = table.concat ( -- still embargoed so no external link
  1077. {
  1078. make_wikilink (handler.link, handler.label),
  1079. handler.separator,
  1080. id,
  1081. err_cat
  1082. });
  1083. else
  1084. text = external_link_id({link = handler.link, label = handler.label, -- no embargo date or embargo has expired, ok to link to article
  1085. prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode}) .. err_cat;
  1086. end
  1087. return text;
  1088. end
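The id checks in pmc() can be tried outside the module with a minimal sketch. It assumes plain Lua with no mw or cfg tables. The name check_pmc and its return values are hypothetical and used only for illustration; the limit is passed in rather than hard-coded.

```lua
-- Hypothetical stand-alone version of the id checks in pmc(); not part of the module.
-- Returns the normalized id string (nil when invalid) and whether a 'PMC' prefix was stripped.
local function check_pmc (id, test_limit)
	local had_prefix = false
	local id_num = id:match ('^[Pp][Mm][Cc](%d+)$')	-- identifier with pmc prefix
	if id_num then
		had_prefix = true	-- pmc() adds the 'pmc_format' maintenance category here
	else
		id_num = id:match ('^%d+$')	-- plain number without pmc prefix
	end
	if not id_num then
		return nil, had_prefix	-- wrong format
	end
	id_num = tonumber (id_num)	-- convert for range testing
	if 1 > id_num or test_limit < id_num then
		return nil, had_prefix	-- outside the accepted range
	end
	return tostring (id_num), had_prefix
end
```

Called as check_pmc ('PMC12345', 7000000), the sketch strips the prefix and reports that it did so; an id of '0' or one above the limit is rejected, which is where pmc() would set the bad_pmc error.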
  1089. --[[--------------------------< D O I >------------------------------------------------------------------------
  1090. Formats a DOI and checks for DOI errors.
  1091. DOI names contain two parts: prefix and suffix separated by a forward slash.
  1092. Prefix: directory indicator '10.' followed by a registrant code
  1093. Suffix: character string of any length chosen by the registrant
This function checks a DOI name for: prefix/suffix. If the DOI name contains spaces or endashes, or if it ends
with a period or a comma, this function will emit a bad_doi error message.
DOI names are case-insensitive and can incorporate any printable Unicode characters, so the test for spaces, endashes,
and terminal punctuation may not be technically correct, but it appears that in practice these characters are rarely
if ever used in DOI names.
  1099. ]]
  1100. local function doi(id, inactive)
  1101. local cat = ""
  1102. local handler = cfg.id_handlers['DOI'];
  1103. local text;
  1104. if is_set(inactive) then
  1105. local inactive_year = inactive:match("%d%d%d%d") or ''; -- try to get the year portion from the inactive date
  1106. if is_set(inactive_year) then
  1107. table.insert( z.error_categories, "自" .. inactive_year .. "年含有不活躍DOI的頁面" );
  1108. else
  1109. table.insert( z.error_categories, "含有不活躍DOI的頁面" ); -- when inactive doesn't contain a recognizable year
  1110. end
  1111. inactive = " (" .. cfg.messages['inactive'] .. " " .. inactive .. ")"
  1112. end
  1113. text = external_link_id({link = handler.link, label = handler.label,
  1114. prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode}) .. (inactive or '')
  1115. if nil == id:match("^10%.[^%s–]-/[^%s–]-[^%.,]$") then -- doi must begin with '10.', must contain a fwd slash, must not contain spaces or endashes, and must not end with period or comma
  1116. cat = ' ' .. set_error( 'bad_doi' );
  1117. end
  1118. return text .. cat
  1119. end
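The format test in doi() can be exercised on its own. This is a small sketch in plain Lua; looks_like_doi is a hypothetical name, and only the pattern is taken from the function above.

```lua
-- Same pattern as in doi(): begins with '10.', contains a forward slash,
-- has no whitespace or endash, and does not end with a period or comma.
local doi_pattern = "^10%.[^%s–]-/[^%s–]-[^%.,]$"

-- Hypothetical helper: true when id passes the format test.
local function looks_like_doi (id)
	return nil ~= id:match (doi_pattern)
end
```

Here looks_like_doi ('10.1000/182') passes, while '10.1000/182.' (terminal period), '10.1000 /182' (space), and '11.1000/182' (wrong directory indicator) fail. Because id:match uses the byte-oriented string library, the endash inside the character class is treated as three separate bytes, so other characters that share one of those bytes are rejected as well.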
  1120. --[[--------------------------< H D L >------------------------------------------------------------------------
  1121. Formats an HDL with minor error checking.
  1122. HDL names contain two parts: prefix and suffix separated by a forward slash.
  1123. Prefix: character string using any character in the UCS-2 character set except '/'
  1124. Suffix: character string of any length using any character in the UCS-2 character set chosen by the registrant
This function checks an HDL name for: prefix/suffix. If the HDL name contains spaces or endashes, or if it ends
with a period or a comma, this function will emit a bad_hdl error message.
HDL names are case-insensitive and can incorporate any printable Unicode characters, so the test for endashes and
terminal punctuation may not be technically correct, but it appears that in practice these characters are rarely
if ever used in HDLs.
  1130. ]]
  1131. local function hdl(id)
  1132. local handler = cfg.id_handlers['HDL'];
  1133. local text = external_link_id({link = handler.link, label = handler.label,
  1134. prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode})
  1135. if nil == id:match("^[^%s–]-/[^%s–]-[^%.,]$") then -- hdl must contain a fwd slash, must not contain spaces, endashes, and must not end with period or comma
  1136. text = text .. ' ' .. set_error( 'bad_hdl' );
  1137. end
  1138. return text;
  1139. end
  1140. --[[--------------------------< O P E N L I B R A R Y >--------------------------------------------------------
  1141. Formats an OpenLibrary link, and checks for associated errors.
  1142. ]]
  1143. local function openlibrary(id)
  1144. local code = id:match("^%d+([AMW])$"); -- only digits followed by 'A', 'M', or 'W'
  1145. local handler = cfg.id_handlers['OL'];
  1146. if ( code == "A" ) then
  1147. return external_link_id({link=handler.link, label=handler.label,
  1148. prefix=handler.prefix .. 'authors/OL',
  1149. id=id, separator=handler.separator, encode = handler.encode})
  1150. elseif ( code == "M" ) then
  1151. return external_link_id({link=handler.link, label=handler.label,
  1152. prefix=handler.prefix .. 'books/OL',
  1153. id=id, separator=handler.separator, encode = handler.encode})
  1154. elseif ( code == "W" ) then
  1155. return external_link_id({link=handler.link, label=handler.label,
  1156. prefix=handler.prefix .. 'works/OL',
  1157. id=id, separator=handler.separator, encode = handler.encode})
  1158. else
  1159. return external_link_id({link=handler.link, label=handler.label,
  1160. prefix=handler.prefix .. 'OL',
  1161. id=id, separator=handler.separator, encode = handler.encode}) .. ' ' .. set_error( 'bad_ol' );
  1162. end
  1163. end
--[[--------------------------< M E S S A G E _ I D >----------------------------------------------------------
Validate and format a usenet message id. Simple error checking, looks for 'id-left@id-right' not enclosed in
'<' and/or '>' angle brackets.
]]
local function message_id (id)
	local handler = cfg.id_handlers['USENETID'];
	local text = external_link_id({link = handler.link, label = handler.label,	-- declared local so that it does not leak into the global environment
		prefix=handler.prefix,id=id,separator=handler.separator, encode=handler.encode})
	if not id:match('^.+@.+$') or not id:match('^[^<].*[^>]$') then	-- doesn't have '@', or first character is '<', or last character is '>'
		text = text .. ' ' .. set_error( 'bad_message_id' )	-- add an error message if the message id is invalid
	end
	return text
end
  1177. --[[--------------------------< S E T _ T I T L E T Y P E >----------------------------------------------------
  1178. This function sets default title types (equivalent to the citation including |type=<default value>) for those templates that have defaults.
  1179. Also handles the special case where it is desirable to omit the title type from the rendered citation (|type=none).
  1180. ]]
  1181. local function set_titletype (cite_class, title_type)
  1182. if is_set(title_type) then
  1183. if "none" == title_type then
  1184. title_type = ""; -- if |type=none then type parameter not displayed
  1185. end
  1186. return title_type; -- if |type= has been set to any other value use that value
  1187. end
  1188. return cfg.title_types [cite_class] or ''; -- set template's default title type; else empty string for concatenation
  1189. end
--[[--------------------------< C L E A N _ I S B N >----------------------------------------------------------
Removes irrelevant text from an ISBN, keeping only digits, hyphens, and 'X'.
Similar to that used for Special:BookSources
]]
local function clean_isbn( isbn_str )
	return (isbn_str:gsub( "[^-0-9X]", "" ));	-- parentheses discard gsub's second return value (the replacement count)
end
  1197. --[[--------------------------< E S C A P E _ L U A _ M A G I C _ C H A R S >----------------------------------
  1198. Returns a string where all of lua's magic characters have been escaped. This is important because functions like
  1199. string.gsub() treat their pattern and replace strings as patterns, not literal strings.
  1200. ]]
  1201. local function escape_lua_magic_chars (argument)
  1202. argument = argument:gsub("%%", "%%%%"); -- replace % with %%
  1203. argument = argument:gsub("([%^%$%(%)%.%[%]%*%+%-%?])", "%%%1"); -- replace all other lua magic pattern characters
  1204. return argument;
  1205. end
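A short sketch of why the escaping matters, in plain Lua. The function body is copied from above; the sample strings are illustrative only.

```lua
local function escape_lua_magic_chars (argument)
	argument = argument:gsub ("%%", "%%%%")	-- replace % with %%
	argument = argument:gsub ("([%^%$%(%)%.%[%]%*%+%-%?])", "%%%1")	-- escape the other magic characters
	return argument
end

local haystack = "1.5 and 125"
-- Unescaped, the '.' in "1.5" matches any character, so "125" is replaced too.
local unescaped = haystack:gsub ("1.5", "#")
-- Escaped, the pattern becomes "1%.5" and only the literal text is replaced.
local escaped = haystack:gsub (escape_lua_magic_chars ("1.5"), "#")
```

After this runs, unescaped holds "# and #" while escaped holds "# and 125".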
  1206. --[[--------------------------< S T R I P _ A P O S T R O P H E _ M A R K U P >--------------------------------
  1207. Strip wiki italic and bold markup from argument so that it doesn't contaminate COinS metadata.
  1208. This function strips common patterns of apostrophe markup. We presume that editors who have taken the time to
  1209. markup a title have, as a result, provided valid markup. When they don't, some single apostrophes are left behind.
  1210. ]]
  1211. local function strip_apostrophe_markup (argument)
  1212. if not is_set (argument) then return argument; end
  1213. while true do
  1214. if argument:match ("%'%'%'%'%'") then -- bold italic (5)
  1215. argument=argument:gsub("%'%'%'%'%'", ""); -- remove all instances of it
  1216. elseif argument:match ("%'%'%'%'") then -- italic start and end without content (4)
  1217. argument=argument:gsub("%'%'%'%'", "");
  1218. elseif argument:match ("%'%'%'") then -- bold (3)
  1219. argument=argument:gsub("%'%'%'", "");
  1220. elseif argument:match ("%'%'") then -- italic (2)
  1221. argument=argument:gsub("%'%'", "");
  1222. else
  1223. break;
  1224. end
  1225. end
  1226. return argument; -- done
  1227. end
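The stripping order (longest run first) can be checked with a stand-alone copy. This sketch assumes plain Lua; is_set here is a hypothetical stand-in for the module's helper of the same name, which is defined elsewhere.

```lua
-- Stand-in for the module's is_set(); assumed to mean "not nil and not empty".
local function is_set (v) return v ~= nil and v ~= '' end

local function strip_apostrophe_markup (argument)
	if not is_set (argument) then return argument end
	while true do
		if argument:match ("%'%'%'%'%'") then	-- bold italic (5)
			argument = argument:gsub ("%'%'%'%'%'", "")
		elseif argument:match ("%'%'%'%'") then	-- italic start and end without content (4)
			argument = argument:gsub ("%'%'%'%'", "")
		elseif argument:match ("%'%'%'") then	-- bold (3)
			argument = argument:gsub ("%'%'%'", "")
		elseif argument:match ("%'%'") then	-- italic (2)
			argument = argument:gsub ("%'%'", "")
		else
			break
		end
	end
	return argument
end
```

With this copy, "''Title''" becomes "Title", and a lone apostrophe as in "O'Brien" is left alone because only runs of two or more are removed.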
  1228. --[[--------------------------< M A K E _ C O I N S _ T I T L E >----------------------------------------------
Makes a title for COinS from Title and / or ScriptTitle (or any other name-script pairs)
Apostrophe markup (bold, italics) is stripped from each value so that the COinS metadata isn't corrupted with strings
of %27%27...
  1232. ]]
  1233. local function make_coins_title (title, script)
  1234. if is_set (title) then
  1235. title = strip_apostrophe_markup (title); -- strip any apostrophe markup
  1236. else
  1237. title=''; -- if not set, make sure title is an empty string
  1238. end
  1239. if is_set (script) then
  1240. script = script:gsub ('^%l%l%s*:%s*', ''); -- remove language prefix if present (script value may now be empty string)
  1241. script = strip_apostrophe_markup (script); -- strip any apostrophe markup
  1242. else
  1243. script=''; -- if not set, make sure script is an empty string
  1244. end
  1245. if is_set (title) and is_set (script) then
  1246. script = ' ' .. script; -- add a space before we concatenate
  1247. end
  1248. return title .. script; -- return the concatenation
  1249. end
  1250. --[[--------------------------< G E T _ C O I N S _ P A G E S >------------------------------------------------
  1251. Extract page numbers from external wikilinks in any of the |page=, |pages=, or |at= parameters for use in COinS.
  1252. ]]
  1253. local function get_coins_pages (pages)
  1254. local pattern;
  1255. if not is_set (pages) then return pages; end -- if no page numbers then we're done
  1256. while true do
		pattern = pages:match("%[(%w*:?//[^ ]+%s+)[%w%d].*%]");	-- pattern is the url and following space(s): "url "; the capture excludes the opening bracket, which is removed below
  1258. if nil == pattern then break; end -- no more urls
  1259. pattern = escape_lua_magic_chars (pattern); -- pattern is not a literal string; escape lua's magic pattern characters
  1260. pages = pages:gsub(pattern, ""); -- remove as many instances of pattern as possible
  1261. end
  1262. pages = pages:gsub("[%[%]]", ""); -- remove the brackets
  1263. pages = pages:gsub("–", "-" ); -- replace endashes with hyphens
  1264. pages = pages:gsub("&%w+;", "-" ); -- and replace html entities (&ndash; etc.) with hyphens; do we need to replace numerical entities like &#32; and the like?
  1265. return pages;
  1266. end
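To see what get_coins_pages () returns for a linked page range, here is a stand-alone sketch in plain Lua. The two helpers are repeated so the block runs on its own; is_set is a hypothetical stand-in for the module's helper, and the URLs are made up.

```lua
-- Stand-in for the module's is_set(); assumed to mean "not nil and not empty".
local function is_set (v) return v ~= nil and v ~= '' end

local function escape_lua_magic_chars (argument)
	argument = argument:gsub ("%%", "%%%%")
	argument = argument:gsub ("([%^%$%(%)%.%[%]%*%+%-%?])", "%%%1")
	return argument
end

local function get_coins_pages (pages)
	local pattern
	if not is_set (pages) then return pages end
	while true do
		pattern = pages:match ("%[(%w*:?//[^ ]+%s+)[%w%d].*%]")	-- first url plus trailing space(s)
		if nil == pattern then break end	-- no more urls
		pattern = escape_lua_magic_chars (pattern)	-- treat the url as literal text
		pages = pages:gsub (pattern, "")	-- remove every copy of that url
	end
	pages = pages:gsub ("[%[%]]", "")	-- remove the brackets
	pages = pages:gsub ("–", "-")	-- replace endashes with hyphens
	pages = pages:gsub ("&%w+;", "-")	-- replace named html entities with hyphens
	return pages
end

local linked = "[http://example.com/p5 5]–[http://example.com/p7 7]"
local plain = get_coins_pages (linked)
```

For the sample input, plain ends up as "5-7": each pass through the loop removes one url, then the brackets and the endash are cleaned up.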
  1267. -- Gets the display text for a wikilink like [[A|B]] or [[B]] gives B
  1268. local function remove_wiki_link( str )
  1269. return (str:gsub( "%[%[([^%[%]]*)%]%]", function(l)
  1270. return l:gsub( "^[^|]*|(.*)$", "%1" ):gsub("^%s*(.-)%s*$", "%1");
  1271. end));
  1272. end
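A quick check of remove_wiki_link () in plain Lua, with the function copied from above and illustrative inputs:

```lua
local function remove_wiki_link (str)
	return (str:gsub ("%[%[([^%[%]]*)%]%]", function (l)
		-- keep only the text after the pipe, if any, then trim whitespace
		return l:gsub ("^[^|]*|(.*)$", "%1"):gsub ("^%s*(.-)%s*$", "%1")
	end))
end

local result = remove_wiki_link ("[[A|B]] and [[C]]")
```

Here result is "B and C". The outer parentheses around str:gsub (...) drop gsub's replacement count so that callers receive a single value.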
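-- Converts hyphens to endashes; returns str unchanged when it is empty or contains markup characters
local function hyphen_to_dash( str )
	if not is_set(str) or str:match( "[%[%]{}<>]" ) ~= nil then
		return str;
	end
	return (str:gsub( '%-', '–' ));	-- '-' is a magic character so escape it; parentheses discard gsub's replacement count
end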
  1280. --[[--------------------------< S A F E _ J O I N >------------------------------------------------------------
  1281. Joins a sequence of strings together while checking for duplicate separation characters.
  1282. ]]
  1283. local function safe_join( tbl, duplicate_char )
  1284. --[[
  1285. Note: we use string functions here, rather than ustring functions.
  1286. This has considerably faster performance and should work correctly as
  1287. long as the duplicate_char is strict ASCII. The strings
  1288. in tbl may be ASCII or UTF8.
  1289. ]]
  1290. local str = ''; -- the output string
  1291. local comp = ''; -- what does 'comp' mean?
  1292. local end_chr = '';
  1293. local trim;
  1294. for _, value in ipairs( tbl ) do
  1295. if value == nil then value = ''; end
  1296. if str == '' then -- if output string is empty
  1297. str = value; -- assign value to it (first time through the loop)
  1298. elseif value ~= '' then
  1299. if value:sub(1,1) == '<' then -- Special case of values enclosed in spans and other markup.
  1300. comp = value:gsub( "%b<>", "" ); -- remove html markup (<span>string</span> -> string)
  1301. else
  1302. comp = value;
  1303. end
  1304. -- typically duplicate_char is sepc
			if comp:sub(1,1) == duplicate_char then	-- is first character same as duplicate_char? why test first character?
				-- Because individual string segments often (always?) begin with terminal punct for the
				-- preceding segment: 'First element' .. 'sepc next element' .. etc?
  1308. trim = false;
  1309. end_chr = str:sub(-1,-1); -- get the last character of the output string
  1310. -- str = str .. "<HERE(enchr=" .. end_chr.. ")" -- debug stuff?
  1311. if end_chr == duplicate_char then -- if same as separator
  1312. str = str:sub(1,-2); -- remove it
  1313. elseif end_chr == "'" then -- if it might be wikimarkup
  1314. if str:sub(-3,-1) == duplicate_char .. "''" then -- if last three chars of str are sepc''
  1315. str = str:sub(1, -4) .. "''"; -- remove them and add back ''
  1316. elseif str:sub(-5,-1) == duplicate_char .. "]]''" then -- if last five chars of str are sepc]]''
  1317. trim = true; -- why? why do this and next differently from previous?
  1318. elseif str:sub(-4,-1) == duplicate_char .. "]''" then -- if last four chars of str are sepc]''
  1319. trim = true; -- same question
  1320. end
  1321. elseif end_chr == "]" then -- if it might be wikimarkup
  1322. if str:sub(-3,-1) == duplicate_char .. "]]" then -- if last three chars of str are sepc]] wikilink
  1323. trim = true;
  1324. elseif str:sub(-2,-1) == duplicate_char .. "]" then -- if last two chars of str are sepc] external link
  1325. trim = true;
  1326. elseif str:sub(-4,-1) == duplicate_char .. "'']" then -- normal case when |url=something & |title=Title.
  1327. trim = true;
  1328. end
  1329. elseif end_chr == " " then -- if last char of output string is a space
  1330. if str:sub(-2,-1) == duplicate_char .. " " then -- if last two chars of str are <sepc><space>
  1331. str = str:sub(1,-3); -- remove them both
  1332. end
  1333. end
  1334. if trim then
  1335. if value ~= comp then -- value does not equal comp when value contains html markup
  1336. local dup2 = duplicate_char;
  1337. if dup2:match( "%A" ) then dup2 = "%" .. dup2; end -- if duplicate_char not a letter then escape it
  1338. value = value:gsub( "(%b<>)" .. dup2, "%1", 1 ) -- remove duplicate_char if it follows html markup
  1339. else
  1340. value = value:sub( 2, -1 ); -- remove duplicate_char when it is first character
  1341. end
  1342. end
  1343. end
  1344. str = str .. value; --add it to the output string
  1345. end
  1346. end
  1347. return str;
  1348. end
--[[--------------------------< I S _ G O O D _ V A N C _ N A M E >--------------------------------------------
For Vancouver Style, author/editor names are supposed to be rendered in Latin (read ASCII) characters. When a name
uses characters that contain diacritical marks, those characters are to be converted to the corresponding Latin character.
When a name is written using a non-Latin alphabet or logogram, that name is to be transliterated into Latin characters.
These things are not currently possible in this module so are left to the editor to do.
This test allows |first= and |last= names to contain any of the letters defined in the four Unicode Latin character sets
[http://www.unicode.org/charts/PDF/U0000.pdf C0 Controls and Basic Latin] 0041–005A, 0061–007A
[http://www.unicode.org/charts/PDF/U0080.pdf C1 Controls and Latin-1 Supplement] 00C0–00D6, 00D8–00F6, 00F8–00FF
[http://www.unicode.org/charts/PDF/U0100.pdf Latin Extended-A] 0100–017F
[http://www.unicode.org/charts/PDF/U0180.pdf Latin Extended-B] 0180–01BF, 01C4–024F
|lastn= also allowed to contain hyphens, spaces, and apostrophes. (http://www.ncbi.nlm.nih.gov/books/NBK7271/box/A35029/)
|firstn= also allowed to contain hyphens, spaces, apostrophes, and periods
At the time of this writing, I had to write the 'if nil == mw.ustring.find ...' test outside of the code editor and paste it here
because the code editor gets confused between character insertion point and cursor position.
]]
local function is_good_vanc_name (last, first)
	-- the last range starts at the single character U+01C4 (Ǆ), matching 01C4–024F in the comment above
	if nil == mw.ustring.find (last, "^[A-Za-zÀ-ÖØ-öø-ƿǄ-ɏ%-%s%']*$") or nil == mw.ustring.find (first, "^[A-Za-zÀ-ÖØ-öø-ƿǄ-ɏ%-%s%'%.]*$") then
		add_vanc_error ();
		return false;	-- not a string of latin characters; Vancouver requires Romanization
	end;
	return true;
end
  1371. --[[--------------------------< R E D U C E _ T O _ I N I T I A L S >------------------------------------------
  1372. Attempts to convert names to initials in support of |name-list-format=vanc.
  1373. Names in |firstn= may be separated by spaces or hyphens, or for initials, a period. See http://www.ncbi.nlm.nih.gov/books/NBK7271/box/A35062/.
  1374. Vancouver style requires family rank designations (Jr, II, III, etc) to be rendered as Jr, 2nd, 3rd, etc. This form is not
  1375. currently supported by this code so correctly formed names like Smith JL 2nd are converted to Smith J2. See http://www.ncbi.nlm.nih.gov/books/NBK7271/box/A35085/.
  1376. This function uses ustring functions because firstname initials may be any of the unicode Latin characters accepted by is_good_vanc_name ().
  1377. ]]
  1378. local function reduce_to_initials(first)
  1379. if mw.ustring.match(first, "^%u%u$") then return first end; -- when first contains just two upper-case letters, nothing to do
  1380. local initials = {}
  1381. local i = 0; -- counter for number of initials
  1382. for word in mw.ustring.gmatch(first, "[^%s%.%-]+") do -- names separated by spaces, hyphens, or periods
  1383. table.insert(initials, mw.ustring.sub(word,1,1)) -- Vancouver format does not include full stops.
  1384. i = i + 1; -- bump the counter
  1385. if 2 <= i then break; end -- only two initials allowed in Vancouver system; if 2, quit
  1386. end
  1387. return table.concat(initials) -- Vancouver format does not include spaces.
  1388. end
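The reduction to initials can be tried in plain Lua with an ASCII-only sketch. This is an assumption-laden stand-in: the module uses mw.ustring so that accented Latin letters work, whereas the hypothetical reduce_to_initials_ascii below uses the byte-oriented string library and is only suitable for ASCII names.

```lua
-- ASCII-only stand-in for reduce_to_initials(); hypothetical name, not the module's function.
local function reduce_to_initials_ascii (first)
	if first:match ("^%u%u$") then return first end	-- already two upper-case initials
	local initials = {}
	for word in first:gmatch ("[^%s%.%-]+") do	-- names separated by spaces, hyphens, or periods
		table.insert (initials, word:sub (1, 1))	-- keep the first letter only
		if 2 <= #initials then break end	-- at most two initials
	end
	return table.concat (initials)	-- no spaces or full stops between initials
end
```

With this sketch, 'John Paul', 'J.-P.', and 'John Paul George' all reduce to 'JP'; a third given name is dropped because the loop stops at two initials.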
  1389. --[[--------------------------< L I S T _ P E O P L E >-------------------------------------------------------
  1390. Formats a list of people (e.g. authors / editors)
  1391. ]]
  1392. local function list_people(control, people, etal, list_name) -- TODO: why is list_name here? not used in this function
  1393. local sep;
  1394. local namesep;
  1395. local format = control.format
  1396. local maximum = control.maximum
  1397. local lastauthoramp = control.lastauthoramp;
  1398. local text = {}
  1399. if 'vanc' == format then -- Vancouver-like author/editor name styling?
  1400. sep = ','; -- name-list separator between authors is a comma
  1401. namesep = ' '; -- last/first separator is a space
  1402. else
  1403. sep = ';' -- name-list separator between authors is a semicolon
  1404. namesep = ', ' -- last/first separator is <comma><space>
  1405. end
  1406. if sep:sub(-1,-1) ~= " " then sep = sep .. " " end
  1407. if is_set (maximum) and maximum < 1 then return "", 0; end -- returned 0 is for EditorCount; not used for authors
  1408. for i,person in ipairs(people) do
  1409. if is_set(person.last) then
  1410. local mask = person.mask
  1411. local one
  1412. local sep_one = sep;
  1413. if is_set (maximum) and i > maximum then
  1414. etal = true;
  1415. break;
  1416. elseif (mask ~= nil) then
  1417. local n = tonumber(mask)
  1418. if (n ~= nil) then
  1419. one = string.rep("&mdash;",n)
  1420. else
  1421. one = mask;
  1422. sep_one = " ";
  1423. end
  1424. else
  1425. one = person.last
  1426. local first = person.first
  1427. if is_set(first) then
  1428. if ( "vanc" == format ) then -- if vancouver format
  1429. one = one:gsub ('%.', ''); -- remove periods from surnames (http://www.ncbi.nlm.nih.gov/books/NBK7271/box/A35029/)
  1430. if not person.corporate and is_good_vanc_name (one, first) then -- and name is all Latin characters; corporate authors not tested
  1431. first = reduce_to_initials(first) -- attempt to convert first name(s) to initials
  1432. end
  1433. end
  1434. one = one .. namesep .. first
  1435. end
  1436. if is_set(person.link) and person.link ~= control.page_name then
  1437. one = "[[" .. person.link .. "|" .. one .. "]]" -- link author/editor if this page is not the author's/editor's page
  1438. end
  1439. end
  1440. table.insert( text, one )
  1441. table.insert( text, sep_one )
  1442. end
  1443. end
  1444. local count = #text / 2; -- (number of names + number of separators) divided by 2
  1445. if count > 0 then
  1446. if count > 1 and is_set(lastauthoramp) and not etal then
  1447. text[#text-2] = " & "; -- replace last separator with ampersand text
  1448. end
  1449. text[#text] = nil; -- erase the last separator
  1450. end
  1451. local result = table.concat(text) -- construct list
  1452. if etal and is_set (result) then -- etal may be set by |display-authors=etal but we might not have a last-first list
		result = result .. sep .. ' ' .. cfg.messages['et al'];	-- we've got a last-first list and etal so add et al.
  1454. end
  1455. return result, count
  1456. end
  1457. --[[--------------------------< A N C H O R _ I D >------------------------------------------------------------
  1458. Generates a CITEREF anchor ID if we have at least one name or a date. Otherwise returns an empty string.
  1459. namelist is one of the contributor-, author-, or editor-name lists chosen in that order. year is Year or anchor_year.
  1460. ]]
  1461. local function anchor_id (namelist, year)
  1462. local names={}; -- a table for the one to four names and year
  1463. for i,v in ipairs (namelist) do -- loop through the list and take up to the first four last names
  1464. names[i] = v.last
  1465. if i == 4 then break end -- if four then done
  1466. end
  1467. table.insert (names, year); -- add the year at the end
  1468. local id = table.concat(names); -- concatenate names and year for CITEREF id
  1469. if is_set (id) then -- if concatenation is not an empty string
  1470. return "CITEREF" .. id; -- add the CITEREF portion
  1471. else
  1472. return ''; -- return an empty string; no reason to include CITEREF id in this citation
  1473. end
  1474. end
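A stand-alone copy of anchor_id () shows the shape of the generated id. The sketch assumes plain Lua; is_set is a hypothetical stand-in for the module's helper, and the names are made up.

```lua
-- Stand-in for the module's is_set(); assumed to mean "not nil and not empty".
local function is_set (v) return v ~= nil and v ~= '' end

local function anchor_id (namelist, year)
	local names = {}
	for i, v in ipairs (namelist) do	-- take up to the first four last names
		names[i] = v.last
		if i == 4 then break end
	end
	table.insert (names, year)	-- add the year at the end
	local id = table.concat (names)
	if is_set (id) then
		return "CITEREF" .. id
	else
		return ''	-- nothing to build an anchor from
	end
end

local id = anchor_id ({{last = 'Smith'}, {last = 'Jones'}}, '2010')
```

Here id is "CITEREFSmithJones2010". With five or more names only the first four are used, and with no names and an empty year the function returns an empty string.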
--[[--------------------------< N A M E _ H A S _ E T A L >----------------------------------------------------
Evaluates the content of author and editor name parameters for variations on the theme of et al. If found,
the et al. is removed, a flag is set to true and the function returns the modified name and the flag.
This function never sets the flag to false but returns its previous state because it may have been set by
previous passes through this function or by the parameters |display-authors=etal or |display-editors=etal
]]
local function name_has_etal (name, etal, nocat)
	if is_set (name) then	-- name can be nil in which case just return
		local etal_pattern = "[;,]? *[\"']*%f[%a][Ee][Tt] *[Aa][Ll][%.\"']*$"	-- variations on the 'et al' theme
		local others_pattern = "[;,]? *%f[%a]and [Oo]thers";	-- an alternative to et al.
  1485. if name:match (etal_pattern) then -- variants on et al.
  1486. name = name:gsub (etal_pattern, ''); -- if found, remove
  1487. etal = true; -- set flag (may have been set previously here or by |display-authors=etal)
  1488. if not nocat then -- no categorization for |vauthors=
  1489. add_maint_cat ('etal'); -- and add a category if not already added
  1490. end
  1491. elseif name:match (others_pattern) then -- if not 'et al.', then 'and others'?
  1492. name = name:gsub (others_pattern, ''); -- if found, remove
  1493. etal = true; -- set flag (may have been set previously here or by |display-authors=etal)
  1494. if not nocat then -- no categorization for |vauthors=
  1495. add_maint_cat ('etal'); -- and add a category if not already added
  1496. end
  1497. end
  1498. end
  1499. return name, etal; --
  1500. end
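The two patterns can be tested in plain Lua with a trimmed-down sketch. It assumes that the frontier pattern %f is available (it is in Lua 5.1 and later); strip_etal is a hypothetical name, and the maintenance category call is left out.

```lua
-- Hypothetical, simplified version of name_has_etal(): no categorization, same patterns.
local function strip_etal (name, etal)
	if name == nil or name == '' then return name, etal end
	local etal_pattern = "[;,]? *[\"']*%f[%a][Ee][Tt] *[Aa][Ll][%.\"']*$"	-- variations on 'et al'
	local others_pattern = "[;,]? *%f[%a]and [Oo]thers"	-- 'and others'
	if name:match (etal_pattern) then
		name = name:gsub (etal_pattern, '')
		etal = true
	elseif name:match (others_pattern) then
		name = name:gsub (others_pattern, '')
		etal = true
	end
	return name, etal	-- etal is never reset to false here
end
```

For example, strip_etal ('Smith et al.', false) gives 'Smith' and true, while strip_etal ('Smith', true) leaves the name alone and passes the flag through unchanged.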
  1501. --[[--------------------------< E X T R A C T _ N A M E S >----------------------------------------------------
  1502. Gets name list from the input arguments
  1503. Searches through args in sequential order to find |lastn= and |firstn= parameters (or their aliases), and their matching link and mask parameters.
  1504. Stops searching when both |lastn= and |firstn= are not found in args after two sequential attempts: found |last1=, |last2=, and |last3= but doesn't
  1505. find |last4= and |last5= then the search is done.
  1506. This function emits an error message when there is a |firstn= without a matching |lastn=. When there are 'holes' in the list of last names, |last1= and |last3=
  1507. are present but |last2= is missing, an error message is emitted. |lastn= is not required to have a matching |firstn=.
  1508. When an author or editor parameter contains some form of 'et al.', the 'et al.' is stripped from the parameter and a flag (etal) returned
  1509. that will cause list_people() to add the static 'et al.' text from Module:Citation/CS1/Configuration. This keeps 'et al.' out of the
  1510. template's metadata. When this occurs, the page is added to a maintenance category.
  1511. ]]
  1512. local function extract_names(args, list_name)
  1513. local names = {}; -- table of names
  1514. local last; -- individual name components
  1515. local first;
  1516. local link;
  1517. local mask;
  1518. local i = 1; -- loop counter/indexer
  1519. local n = 1; -- output table indexer
  1520. local count = 0; -- used to count the number of times we haven't found a |last= (or alias for authors, |editor-last or alias for editors)
  1521. local etal=false; -- return value set to true when we find some form of et al. in an author parameter
  1522. local err_msg_list_name = list_name:match ("(%w+)List") .. 's list'; -- modify AuthorList or EditorList for use in error messages if necessary
  1523. while true do
  1524. last = select_one( args, cfg.aliases[list_name .. '-Last'], 'redundant_parameters', i ); -- search through args for name components beginning at 1
  1525. first = select_one( args, cfg.aliases[list_name .. '-First'], 'redundant_parameters', i );
  1526. link = select_one( args, cfg.aliases[list_name .. '-Link'], 'redundant_parameters', i );
  1527. mask = select_one( args, cfg.aliases[list_name .. '-Mask'], 'redundant_parameters', i );
  1528. last, etal = name_has_etal (last, etal, false); -- find and remove variations on et al.
  1529. first, etal = name_has_etal (first, etal, false); -- find and remove variations on et al.
  1530. if first and not last then -- if there is a firstn without a matching lastn
  1531. table.insert( z.message_tail, { set_error( 'first_missing_last', {err_msg_list_name, i}, true ) } ); -- add this error message
  1532. elseif not first and not last then -- if both firstn and lastn aren't found, are we done?
  1533. count = count + 1; -- number of times we haven't found last and first
  1534. if 2 <= count then -- two missing names and we give up
  1535. break; -- normal exit or there is a two-name hole in the list; can't tell which
  1536. end
  1537. else -- we have last with or without a first
  1538. if is_set (link) and false == link_param_ok (link) then -- do this test here in case link is missing last
  1539. table.insert( z.message_tail, { set_error( 'bad_paramlink', list_name:match ("(%w+)List"):lower() .. '-link' .. i )}); -- url or wikilink in author link;
  1540. end
  1541. names[n] = {last = last, first = first, link = link, mask = mask, corporate=false}; -- add this name to our names list (corporate for |vauthors= only)
  1542. n = n + 1; -- point to next location in the names table
  1543. if 1 == count then -- if the previous name was missing
  1544. table.insert( z.message_tail, { set_error( 'missing_name', {err_msg_list_name, i-1}, true ) } ); -- add this error message
  1545. end
  1546. count = 0; -- reset the counter, we're looking for two consecutive missing names
  1547. end
  1548. i = i + 1; -- point to next args location
  1549. end
  1550. return names, etal; -- all done, return our list of names
  1551. end
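The stopping rule and the hole detection in extract_names () can be sketched without the rest of the module. This is a simplified illustration in plain Lua: collect_names is a hypothetical name, it reads only args['last' .. i] and args['first' .. i] (no aliases, links, or masks), and it collects plain strings instead of calling set_error ().

```lua
-- Hypothetical, simplified version of the scan in extract_names().
local function collect_names (args)
	local names, errors = {}, {}
	local i, count = 1, 0	-- count: consecutive positions with neither last nor first
	while true do
		local last = args['last' .. i]
		local first = args['first' .. i]
		if first and not last then	-- firstn without a matching lastn
			table.insert (errors, 'first' .. i .. ' without last' .. i)
		elseif not first and not last then
			count = count + 1
			if 2 <= count then break end	-- two consecutive missing names: give up
		else
			table.insert (names, {last = last, first = first})
			if 1 == count then	-- the previous position was a one-name hole
				table.insert (errors, 'missing name ' .. (i - 1))
			end
			count = 0	-- reset; we are looking for two consecutive misses
		end
		i = i + 1
	end
	return names, errors
end

local names, errors = collect_names ({last1 = 'A', last3 = 'C', last6 = 'F'})
```

In this run the hole at position 2 is reported and 'C' is still collected, but 'F' is never reached because positions 4 and 5 are both empty and the scan stops there.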
--[[--------------------------< E X T R A C T _ I D S >------------------------------------------------------------
Populates ID table from arguments using configuration settings. Loops through cfg.id_handlers and searches args for
any of the parameters listed in each cfg.id_handlers['...'].parameters. If found, adds the parameter and value to
the identifier list. Emits a redundant-parameter error message if more than one alias exists in args.
]]
  1557. local function extract_ids( args )
  1558. local id_list = {}; -- list of identifiers found in args
  1559. for k, v in pairs( cfg.id_handlers ) do -- k is uc identifier name as index to cfg.id_handlers; e.g. cfg.id_handlers['ISBN'], v is a table
  1560. v = select_one( args, v.parameters, 'redundant_parameters' ); -- v.parameters is a table of aliases for k; here we pick one from args if present
  1561. if is_set(v) then id_list[k] = v; end -- if found in args, add identifier to our list
  1562. end
  1563. return id_list;
  1564. end
  1565. --[[--------------------------< B U I L D _ I D _ L I S T >--------------------------------------------------------
  1566. Takes a table of IDs created by extract_ids() and turns it into a table of formatted ID outputs.
  1567. inputs:
  1568. id_list – table of identifiers built by extract_ids()
  1569. options – table of various template parameter values used to modify some manually handled identifiers
  1570. ]]
  1571. local function build_id_list( id_list, options )
  1572. local new_list, handler = {};
  1573. local function fallback(k) return { __index = function(t,i) return cfg.id_handlers[k][i] end } end; -- local so the helper does not leak into the global environment
  1574. for k, v in pairs( id_list ) do -- k is uc identifier name as index to cfg.id_handlers; e.g. cfg.id_handlers['ISBN'], v is the identifier value taken from args
  1575. -- fallback to read-only cfg
  1576. handler = setmetatable( { ['id'] = v }, fallback(k) );
  1577. if handler.mode == 'external' then
  1578. table.insert( new_list, {handler.label, external_link_id( handler ) } );
  1579. elseif handler.mode == 'internal' then
  1580. table.insert( new_list, {handler.label, internal_link_id( handler ) } );
  1581. elseif handler.mode ~= 'manual' then
  1582. error( cfg.messages['unknown_ID_mode'] );
  1583. elseif k == 'DOI' then
  1584. table.insert( new_list, {handler.label, doi( v, options.DoiBroken ) } );
  1585. elseif k == 'HDL' then
  1586. table.insert( new_list, {handler.label, hdl( v ) } );
  1587. elseif k == 'ARXIV' then
  1588. table.insert( new_list, {handler.label, arxiv( v, options.Class ) } );
  1589. elseif k == 'ASIN' then
  1590. table.insert( new_list, {handler.label, amazon( v, options.ASINTLD ) } );
  1591. elseif k == 'LCCN' then
  1592. table.insert( new_list, {handler.label, lccn( v ) } );
  1593. elseif k == 'OL' or k == 'OLA' then
  1594. table.insert( new_list, {handler.label, openlibrary( v ) } );
  1595. elseif k == 'PMC' then
  1596. table.insert( new_list, {handler.label, pmc( v, options.Embargo ) } );
  1597. elseif k == 'PMID' then
  1598. table.insert( new_list, {handler.label, pmid( v ) } );
  1599. elseif k == 'ISMN' then
  1600. table.insert( new_list, {handler.label, ismn( v ) } );
  1601. elseif k == 'ISSN' then
  1602. table.insert( new_list, {handler.label, issn( v ) } );
  1603. elseif k == 'EISSN' then
  1604. table.insert( new_list, {handler.label, issn( v, true ) } ); -- true distinguishes eissn from issn
  1605. elseif k == 'ISBN' then
  1606. local ISBN = internal_link_id( handler );
  1607. if not check_isbn( v ) and not is_set(options.IgnoreISBN) then
  1608. ISBN = ISBN .. set_error( 'bad_isbn', {}, false, " ", "" );
  1609. end
  1610. table.insert( new_list, {handler.label, ISBN } );
  1611. elseif k == 'USENETID' then
  1612. table.insert( new_list, {handler.label, message_id( v ) } );
  1613. else
  1614. error( cfg.messages['unknown_manual_ID'] );
  1615. end
  1616. end
  1617. local function comp( a, b ) -- used in following table.sort(); local so it does not leak into the global environment
  1618. return a[1] < b[1];
  1619. end
  1620. table.sort( new_list, comp );
  1621. for k, v in ipairs( new_list ) do
  1622. new_list[k] = v[2];
  1623. end
  1624. return new_list;
  1625. end
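-- Illustration only, not part of the module: a standalone sketch of the two techniques used by build_id_list().
-- First, a handler table that falls back to a read-only configuration entry through __index; second, sorting
-- {label, text} pairs by label and then flattening them to the text alone. The id_handlers table and its
-- contents are hypothetical.

```lua
local id_handlers = { ISBN = { label = 'ISBN', mode = 'manual' } };		-- hypothetical configuration

local function fallback (name)					-- metatable that reads missing keys from the configuration
	return { __index = function (_, key) return id_handlers[name][key] end };
end

local handler = setmetatable ({ id = '0-306-40615-2' }, fallback ('ISBN'));

local new_list = { { 'PMID', 'p' }, { 'DOI', 'd' }, { 'ISBN', 'i' } };
table.sort (new_list, function (a, b) return a[1] < b[1] end);			-- order by label
for i, pair in ipairs (new_list) do
	new_list[i] = pair[2];						-- keep only the rendered text
end
```

-- handler.id is stored on the handler itself while handler.label is read from the configuration, which is left unmodified.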
  1626. --[[--------------------------< C O I N S _ C L E A N U P >----------------------------------------------------
  1627. Cleanup parameter values for the metadata by removing or replacing invisible characters and certain html entities.
  1628. 2015-12-10: there is a bug in mw.text.unstripNoWiki (). It replaced math stripmarkers with the appropriate content
  1629. when it shouldn't. See https://phabricator.wikimedia.org/T121085 and Wikipedia_talk:Lua#stripmarkers_and_mw.text.unstripNoWiki.28.29
  1630. TODO: move the replacement patterns and replacement values into a table in /Configuration similar to the invisible
  1631. characters table?
  1632. ]]
  1633. local function coins_cleanup (value)
  1634. value = mw.text.unstripNoWiki (value); -- replace nowiki stripmarkers with their content
  1635. value = value:gsub ('<span class="nowrap" style="padding%-left:0%.1em;">&#39;s</span>', "'s"); -- replace {{'s}} template with simple apostrophe-s
  1636. value = value:gsub ('&zwj;\226\128\138\039\226\128\139', "'"); -- replace {{'}} with simple apostrophe
  1637. value = value:gsub ('\226\128\138\039\226\128\139', "'"); -- replace {{'}} with simple apostrophe (as of 2015-12-11)
  1638. value = value:gsub ('&nbsp;', ' '); -- replace &nbsp; entity with plain space
  1639. value = value:gsub ('\226\128\138', ' '); -- replace hair space with plain space
  1640. value = value:gsub ('&zwj;', ''); -- remove &zwj; entities
  1641. value = mw.ustring.gsub (value, '[\226\128\141\226\128\139]', ''); -- remove zero-width joiner, zero-width space; mw.ustring matches whole code points where string.gsub would match the individual bytes
  1642. value = value:gsub ('\194\173', ' '):gsub ('[\009\010\013]', ' '); -- replace soft hyphen, horizontal tab, line feed, carriage return with plain space
  1643. return value;
  1644. end
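-- Illustration only, not part of the module: Lua's string library works on bytes, so a multi-byte UTF-8 character
-- is safe as a plain sequence in a pattern but not inside a bracketed set, where every byte matches on its own.
-- The sample strings are made up for this sketch.

```lua
local hair_space = '\226\128\138';				-- U+200A
local zero_width_space = '\226\128\139';		-- U+200B
local en_dash = '\226\128\147';					-- U+2013

local sample = 'a' .. hair_space .. 'b' .. zero_width_space .. 'c';
local cleaned = sample:gsub (hair_space, ' '):gsub (zero_width_space, '');	-- sequences: only whole characters match

local damaged = en_dash:gsub ('[' .. zero_width_space .. ']', '');		-- set of bytes 226, 128, 139: strips two bytes of the dash
```

-- cleaned is 'a bc'; damaged is the single stray byte '\147', which is no longer valid UTF-8.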
  1645. --[[--------------------------< C O I N S >--------------------------------------------------------------------
  1646. COinS metadata (see <http://ocoins.info/>) allows automated tools to parse the citation information.
  1647. ]]
  1648. local function COinS(data, class)
  1649. if 'table' ~= type(data) or nil == next(data) then
  1650. return '';
  1651. end
  1652. for k, v in pairs (data) do -- spin through all of the metadata parameter values
  1653. if 'ID_list' ~= k and 'Authors' ~= k then -- except the ID_list and Author tables (author nowiki stripmarker done when Author table processed)
  1654. data[k] = coins_cleanup (v);
  1655. end
  1656. end
  1657. local ctx_ver = "Z39.88-2004";
  1658. -- treat table strictly as an array with only set values.
  1659. local OCinSoutput = setmetatable( {}, {
  1660. __newindex = function(self, key, value)
  1661. if is_set(value) then
  1662. rawset( self, #self+1, table.concat{ key, '=', mw.uri.encode( remove_wiki_link( value ) ) } );
  1663. end
  1664. end
  1665. });
  1666. if in_array (class, {'arxiv', 'journal', 'news'}) or (in_array (class, {'conference', 'interview', 'map', 'press release', 'web'}) and is_set(data.Periodical)) or
  1667. ('citation' == class and is_set(data.Periodical) and not is_set (data.Encyclopedia)) then
  1668. OCinSoutput.rft_val_fmt = "info:ofi/fmt:kev:mtx:journal"; -- journal metadata identifier
  1669. if 'arxiv' == class then -- set genre according to the type of citation template we are rendering
  1670. OCinSoutput["rft.genre"] = "preprint"; -- cite arxiv
  1671. elseif 'conference' == class then
  1672. OCinSoutput["rft.genre"] = "conference"; -- cite conference (when Periodical set)
  1673. elseif 'web' == class then
  1674. OCinSoutput["rft.genre"] = "unknown"; -- cite web (when Periodical set)
  1675. else
  1676. OCinSoutput["rft.genre"] = "article"; -- journal and other 'periodical' articles
  1677. end
  1678. OCinSoutput["rft.jtitle"] = data.Periodical; -- journal only
  1679. if is_set (data.Map) then
  1680. OCinSoutput["rft.atitle"] = data.Map; -- for a map in a periodical
  1681. else
  1682. OCinSoutput["rft.atitle"] = data.Title; -- all other 'periodical' article titles
  1683. end
  1684. -- these are used only for periodicals
  1685. OCinSoutput["rft.ssn"] = data.Season; -- keywords: winter, spring, summer, fall
  1686. OCinSoutput["rft.chron"] = data.Chron; -- free-form date components
  1687. OCinSoutput["rft.volume"] = data.Volume; -- does not apply to books
  1688. OCinSoutput["rft.issue"] = data.Issue;
  1689. OCinSoutput["rft.pages"] = data.Pages; -- also used in book metadata
  1690. elseif 'thesis' ~= class then -- all others except cite thesis are treated as 'book' metadata; genre distinguishes
  1691. OCinSoutput.rft_val_fmt = "info:ofi/fmt:kev:mtx:book"; -- book metadata identifier
  1692. if 'report' == class or 'techreport' == class then -- cite report and cite techreport
  1693. OCinSoutput["rft.genre"] = "report";
  1694. elseif 'conference' == class then -- cite conference when Periodical not set
  1695. OCinSoutput["rft.genre"] = "conference";
  1696. elseif in_array (class, {'book', 'citation', 'encyclopaedia', 'interview', 'map'}) then
  1697. if is_set (data.Chapter) then
  1698. OCinSoutput["rft.genre"] = "bookitem";
  1699. OCinSoutput["rft.atitle"] = data.Chapter; -- book chapter, encyclopedia article, interview in a book, or map title
  1700. else
  1701. if 'map' == class or 'interview' == class then
  1702. OCinSoutput["rft.genre"] = 'unknown'; -- standalone map or interview
  1703. else
  1704. OCinSoutput["rft.genre"] = 'book'; -- book and encyclopedia
  1705. end
  1706. end
  1707. else --{'audio-visual', 'AV-media-notes', 'DVD-notes', 'episode', 'interview', 'mailinglist', 'map', 'newsgroup', 'podcast', 'press release', 'serial', 'sign', 'speech', 'web'}
  1708. OCinSoutput["rft.genre"] = "unknown";
  1709. end
  1710. OCinSoutput["rft.btitle"] = data.Title; -- book only
  1711. OCinSoutput["rft.place"] = data.PublicationPlace; -- book only
  1712. OCinSoutput["rft.series"] = data.Series; -- book only
  1713. OCinSoutput["rft.pages"] = data.Pages; -- book, journal
  1714. OCinSoutput["rft.edition"] = data.Edition; -- book only
  1715. OCinSoutput["rft.pub"] = data.PublisherName; -- book and dissertation
  1716. else -- cite thesis
  1717. OCinSoutput.rft_val_fmt = "info:ofi/fmt:kev:mtx:dissertation"; -- dissertation metadata identifier
  1718. OCinSoutput["rft.title"] = data.Title; -- dissertation (also patent but that is not yet supported)
  1719. OCinSoutput["rft.degree"] = data.Degree; -- dissertation only
  1720. OCinSoutput['rft.inst'] = data.PublisherName; -- book and dissertation
  1721. end
  1722. -- and now common parameters (as much as possible)
  1723. OCinSoutput["rft.date"] = data.Date; -- book, journal, dissertation
  1724. for k, v in pairs( data.ID_list ) do -- what to do about these? For now assume that they are common to all?
  1725. if k == 'ISBN' then v = clean_isbn( v ) end
  1726. local id = cfg.id_handlers[k].COinS;
  1727. if string.sub( id or "", 1, 4 ) == 'info' then -- for ids that are in the info:registry
  1728. OCinSoutput["rft_id"] = table.concat{ id, "/", v };
  1729. elseif string.sub (id or "", 1, 3 ) == 'rft' then -- for isbn, issn, eissn, etc that have defined COinS keywords
  1730. OCinSoutput[ id ] = v;
  1731. elseif id then -- when cfg.id_handlers[k].COinS is not nil
  1732. OCinSoutput["rft_id"] = table.concat{ cfg.id_handlers[k].prefix, v }; -- others; provide a url
  1733. end
  1734. end
  1735. --[[
  1736. for k, v in pairs( data.ID_list ) do -- what to do about these? For now assume that they are common to all?
  1737. local id, value = cfg.id_handlers[k].COinS;
  1738. if k == 'ISBN' then value = clean_isbn( v ); else value = v; end
  1739. if string.sub( id or "", 1, 4 ) == 'info' then
  1740. OCinSoutput["rft_id"] = table.concat{ id, "/", v };
  1741. else
  1742. OCinSoutput[ id ] = value;
  1743. end
  1744. end
  1745. ]]
  1746. local last, first;
  1747. for k, v in ipairs( data.Authors ) do
  1748. last, first = coins_cleanup (v.last), coins_cleanup (v.first or ''); -- replace any nowiki strip markers, non-printing or invisible characters
  1749. if k == 1 then -- for the first author name only
  1750. if is_set(last) and is_set(first) then -- set these COinS values if |first= and |last= specify the first author name
  1751. OCinSoutput["rft.aulast"] = last; -- book, journal, dissertation
  1752. OCinSoutput["rft.aufirst"] = first; -- book, journal, dissertation
  1753. elseif is_set(last) then
  1754. OCinSoutput["rft.au"] = last; -- book, journal, dissertation -- otherwise use this form for the first name
  1755. end
  1756. else -- for all other authors
  1757. if is_set(last) and is_set(first) then
  1758. OCinSoutput["rft.au"] = table.concat{ last, ", ", first }; -- book, journal, dissertation
  1759. elseif is_set(last) then
  1760. OCinSoutput["rft.au"] = last; -- book, journal, dissertation
  1761. end
  1762. end
  1763. end
  1764. OCinSoutput.rft_id = data.URL;
  1765. OCinSoutput.rfr_id = table.concat{ "info:sid/", mw.site.server:match( "[^/]*$" ), ":", data.RawPage };
  1766. OCinSoutput = setmetatable( OCinSoutput, nil );
  1767. -- sort with version string always first, and combine.
  1768. table.sort( OCinSoutput );
  1769. table.insert( OCinSoutput, 1, "ctx_ver=" .. ctx_ver ); -- such as "Z39.88-2004"
  1770. return table.concat(OCinSoutput, "&");
  1771. end
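-- Illustration only, not part of the module: a standalone sketch of the __newindex technique that COinS() uses to
-- collect key=value pairs. Each assignment with a set value is appended to the array part, so repeated keys such as
-- rft.au are all kept. is_set() here is a simplified stand-in, and the percent-encoding and wikilink removal done
-- by the real function are left out.

```lua
local function is_set (value)					-- simplified stand-in for the module's is_set()
	return value ~= nil and value ~= '';
end

local out = setmetatable ({}, {
	__newindex = function (self, key, value)
		if is_set (value) then
			rawset (self, #self + 1, key .. '=' .. value);	-- append, never store under the string key
		end
	end
});

out['rft.genre'] = 'book';
out['rft.btitle'] = 'Example';
out['rft.place'] = '';							-- empty value: ignored
out['rft.au'] = 'Smith';
out['rft.au'] = 'Jones';						-- same key again: appended as a second pair

setmetatable (out, nil);						-- back to a plain array
table.sort (out);
table.insert (out, 1, 'ctx_ver=Z39.88-2004');	-- version string always first
local query = table.concat (out, '&');
```

-- Because the string key is never stored, every assignment reaches __newindex, which is what lets one key appear several times.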
  1772. --[[--------------------------< G E T _ I S O 6 3 9 _ C O D E >------------------------------------------------
  1773. Validates language names provided in |language= parameter if not an ISO639-1 code. Handles the special case that is Norwegian where
  1774. ISO639-1 code 'no' is mapped to language name 'Norwegian Bokmål' by Extension:CLDR.
  1775. Returns the language name and associated ISO639-1 code. Because case of the source may be incorrect or different from the case that Wikimedia
  1776. uses, the name comparisons are done in lower case and when a match is found, the Wikimedia version (assumed to be correct) is returned along
  1777. with the code. When there is no match, we return the original language name string.
  1778. mw.language.fetchLanguageNames() will return a list of languages that aren't part of ISO639-1. Names that aren't ISO639-1 but that are included
  1779. in the list will be found if that name is provided in the |language= parameter. For example, if |language=Samaritan Aramaic, that name will be
  1780. found with the associated code 'sam', not an ISO639-1 code. When names are found and the associated code is not two characters, this function
  1781. returns only the Wikimedia language name.
  1782. Adapted from code taken from Module:Check ISO 639-1.
  1783. ]]
  1784. local function get_iso639_code (lang)
  1785. if 'norwegian' == lang:lower() then -- special case related to Wikimedia remap of code 'no' at Extension:CLDR
  1786. return 'Norwegian', 'no'; -- Make sure rendered version is properly capitalized
  1787. end
  1788. local languages = mw.language.fetchLanguageNames(mw.getContentLanguage():getCode(), 'all') -- get a list of language names known to Wikimedia
  1789. -- ('all' is required for North Ndebele, South Ndebele, and Ojibwa)
  1790. local langlc = mw.ustring.lower(lang); -- lower case version for comparisons
  1791. for code, name in pairs(languages) do -- scan the list to see if we can find our language
  1792. if langlc == mw.ustring.lower(name) then
  1793. if 2 ~= code:len() then -- ISO639-1 codes only
  1794. return name; -- so return the name but not the code
  1795. end
  1796. return name, code; -- found it, return name to ensure proper capitalization and the ISO639-1 code
  1797. end
  1798. end
  1799. return lang; -- not valid language; return language in original case and nil for ISO639-1 code
  1800. end
  1801. --[[--------------------------< L A N G U A G E _ P A R A M E T E R >------------------------------------------
  1802. Get language name from ISO639-1 code value provided. If a code is valid use the returned name; if not, then use the value that was provided with the language parameter.
  1803. There is an exception. There are three ISO639-1 codes for Norwegian language variants. There are two official variants: Norwegian Bokmål (code 'nb') and
  1804. Norwegian Nynorsk (code 'nn'). The third, code 'no', is defined by ISO639-1 as 'Norwegian' though in Norway this is pretty much meaningless. However, it appears
  1805. that on enwiki, editors are for the most part unaware of the nb and nn variants (compare page counts for these variants at Category:Articles with non-English-language external links).
  1806. Because Norwegian Bokmål is the most common language variant, MediaWiki has been modified to return Norwegian Bokmål for ISO639-1 code 'no'. Here we undo that and
  1807. return 'Norwegian' (rendered by this module as '挪威语') when editors use |language=no. We presume that editors don't know about the variants or can't discriminate between them.
  1808. See Help talk:Citation Style_1#An ISO 639-1 language name test
  1809. When |language= contains a valid ISO639-1 code, the page is assigned to the category for that code: Category:Norwegian-language sources (no) if
  1810. the page is a mainspace page and the ISO639-1 code is neither 'zh' nor a 'zh-' variant (the code below tests for 'zh', not 'en'). Similarly, if the parameter is |language=Norwegian, it will be categorized in the same way.
  1811. This function supports multiple languages in the form |language=nb, French, th where the language names or codes are separated from each other by commas.
  1812. ]]
  1813. local function language_parameter (lang)
  1814. local code; -- the ISO639-1 two character code
  1815. local name; -- the language name
  1816. local language_list = {}; -- table of language names to be rendered
  1817. local names_table = {}; -- table made from the value assigned to |language=
  1818. names_table = mw.text.split (lang, '%s*,%s*'); -- names should be a comma separated list
  1819. for _, lang in ipairs (names_table) do -- reuse lang
  1820. if lang:match ('^%a%a%-') or 2 == lang:len() then -- ISO639-1 language code are 2 characters (fetchLanguageName also supports 3 character codes)
  1821. if lang:match ('^zh-') then
  1822. name = mw.language.fetchLanguageName( lang:lower(), lang:lower() );
  1823. else
  1824. name = mw.language.fetchLanguageName( lang:lower(), mw.getContentLanguage():getCode() ); -- get ISO 639-1 language name if Language is a proper code
  1825. end
  1826. end
  1827. if is_set (name) then -- if Language specified a valid ISO639-1 code
  1828. code = lang:lower(); -- save it
  1829. else
  1830. name, code = get_iso639_code (lang); -- attempt to get code from name (assign name here so that we are sure of proper capitalization)
  1831. end
  1832. if is_set (code) then
  1833. if 'no' == code then name = '挪威语' end; -- override wikimedia when code is 'no'
  1834. if 'zh' ~= code and not code:match ('^zh-') then -- Chinese is not the language
  1835. add_prop_cat ('foreign_lang_source', {name, code})
  1836. end
  1837. else
  1838. add_maint_cat ('unknown_lang'); -- add maint category if not already added
  1839. end
  1840. table.insert (language_list, name);
  1841. name = ''; -- so we can reuse it
  1842. end
  1843. code = #language_list -- reuse code as number of languages in the list
  1844. if 2 >= code then
  1845. name = table.concat (language_list, '及') -- insert '及' between two language names
  1846. elseif 2 < code then
  1847. language_list[code] = '及' .. language_list[code]; -- prepend last name with '及'
  1848. name = table.concat (language_list, '、'); -- and concatenate with '、' separators
  1849. name = name:gsub ('、及', '及', 1); -- drop the separator that now precedes the final '及'
  1850. end
  1851. return (" " .. wrap_msg ('language', name)); -- wrap the list of names with the 'language' message ('(in ...)')
  1852. end
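-- Illustration only, not part of the module: a standalone sketch of how the language names are joined at the end of
-- language_parameter(). join_languages() is a hypothetical helper; it works on a copy so the caller's table is untouched.

```lua
local function join_languages (list)
	local count = #list;
	if count <= 2 then
		return table.concat (list, '及');		-- zero, one or two names: '及' between the two
	end
	local copy = {};
	for i, name in ipairs (list) do copy[i] = name; end
	copy[count] = '及' .. copy[count];			-- mark the last name
	local joined = table.concat (copy, '、');
	return (joined:gsub ('、及', '及', 1));		-- parentheses discard gsub's second return value
end
```

-- For three names A, B and C this gives 'A、B及C'.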
  1853. --[[--------------------------< S E T _ C S 1 _ S T Y L E >----------------------------------------------------
  1854. Set style settings for CS1 citation templates. Returns separator and postscript settings
  1855. ]]
  1856. local function set_cs1_style (ps)
  1857. if not is_set (ps) then -- unless explicitly set to something
  1858. ps = '.'; -- terminate the rendered citation with a period
  1859. end
  1860. return '.', ps; -- separator is a full stop
  1861. end
  1862. --[[--------------------------< S E T _ C S 2 _ S T Y L E >----------------------------------------------------
  1863. Set style settings for CS2 citation templates. Returns separator, postscript, ref settings
  1864. ]]
  1865. local function set_cs2_style (ps, ref)
  1866. if not is_set (ps) then -- if |postscript= has not been set, set cs2 default
  1867. ps = ''; -- make sure it isn't nil
  1868. end
  1869. if not is_set (ref) then -- if |ref= is not set
  1870. ref = "harv"; -- set default |ref=harv
  1871. end
  1872. return ',', ps, ref; -- separator is a comma
  1873. end
  1874. --[[--------------------------< G E T _ S E T T I N G S _ F R O M _ C I T E _ C L A S S >----------------------
  1875. When |mode= is not set or when its value is invalid, use config.CitationClass and parameter values to establish
  1876. rendered style.
  1877. ]]
  1878. local function get_settings_from_cite_class (ps, ref, cite_class)
  1879. local sep;
  1880. if (cite_class == "citation") then -- for citation templates (CS2)
  1881. sep, ps, ref = set_cs2_style (ps, ref);
  1882. else -- not a citation template so CS1
  1883. sep, ps = set_cs1_style (ps);
  1884. end
  1885. return sep, ps, ref -- return them all
  1886. end
  1887. --[[--------------------------< S E T _ S T Y L E >------------------------------------------------------------
  1888. Establish basic style settings to be used when rendering the citation. Uses |mode= if set and valid or uses
  1889. config.CitationClass from the template's #invoke: to establish style.
  1890. ]]
  1891. local function set_style (mode, ps, ref, cite_class)
  1892. local sep;
  1893. if 'cs2' == mode then -- if this template is to be rendered in CS2 (citation) style
  1894. sep, ps, ref = set_cs2_style (ps, ref);
  1895. elseif 'cs1' == mode then -- if this template is to be rendered in CS1 (cite xxx) style
  1896. sep, ps = set_cs1_style (ps);
  1897. else -- anything but cs1 or cs2
  1898. sep, ps, ref = get_settings_from_cite_class (ps, ref, cite_class); -- get settings based on the template's CitationClass
  1899. end
  1900. if 'none' == ps:lower() then -- if assigned value is 'none' then
  1901. ps = ''; -- set to empty string
  1902. end
  1903. return sep, ps, ref
  1904. end
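-- Illustration only, not part of the module: the style rules above folded into one hypothetical function,
-- resolve_style_sketch(), to show the order of precedence. |mode= wins when it is cs1 or cs2; otherwise the
-- citation class decides. The inline emptiness tests stand in for the module's is_set().

```lua
local function resolve_style_sketch (mode, ps, ref, cite_class)
	local cs2 = ('cs2' == mode) or ('cs1' ~= mode and 'citation' == cite_class);
	local sep;
	if cs2 then
		sep = ',';
		ps = (ps ~= nil and ps ~= '') and ps or '';			-- cs2 default: no terminal punctuation
		ref = (ref ~= nil and ref ~= '') and ref or 'harv';	-- cs2 default anchor
	else
		sep = '.';
		ps = (ps ~= nil and ps ~= '') and ps or '.';		-- cs1 default: terminal period
	end
	if 'none' == ps:lower () then
		ps = '';								-- |postscript=none suppresses terminal punctuation
	end
	return sep, ps, ref;
end
```

-- For example, a cite book template with |mode=cs2 gets the comma separator, an empty postscript and ref 'harv'.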
  1905. --[=[-------------------------< I S _ P D F >------------------------------------------------------------------
  1906. Determines if a url has the file extension that is one of the pdf file extensions used by [[MediaWiki:Common.css]] when
  1907. applying the pdf icon to external links.
  1908. returns the matched extension text (a truthy value) if the url contains one of the recognized extensions, else nil
  1909. ]=]
  1910. local function is_pdf (url)
  1911. return url:match ('%.pdf[%?#]?') or url:match ('%.PDF[%?#]?');
  1912. end
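-- Illustration only, not part of the module: a standalone check of what the two patterns in is_pdf() accept.
-- Because the trailing [%?#] is optional, '.pdf' matches anywhere in the url. is_pdf_strict() is a hypothetical
-- stricter variant shown only for comparison; it is not what the module does.

```lua
local function is_pdf_sketch (url)				-- same two patterns as is_pdf()
	return url:match ('%.pdf[%?#]?') or url:match ('%.PDF[%?#]?');
end

local function is_pdf_strict (url)				-- hypothetical: extension must end the url or precede ? or #
	local lc = url:lower ();
	return (lc:match ('%.pdf$') or lc:match ('%.pdf[%?#]')) ~= nil;
end
```

-- is_pdf_sketch() returns the matched text rather than a boolean, which is enough for the truth test in style_format().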
  1913. --[[--------------------------< S T Y L E _ F O R M A T >------------------------------------------------------
  1914. Applies css style to |format=, |chapter-format=, etc. Also emits an error message if the format parameter does
  1915. not have a matching url parameter. If the format parameter is not set and the url contains a file extension that
  1916. is recognized as a pdf document by MediaWiki:Common.css, this code will set the format parameter to (PDF) with
  1917. the appropriate styling.
  1918. ]]
  1919. local function style_format (format, url, fmt_param, url_param)
  1920. if is_set (format) then
  1921. format = wrap_style ('format', format); -- add leading space, parentheses, resize
  1922. if not is_set (url) then
  1923. format = format .. set_error( 'format_missing_url', {fmt_param, url_param} ); -- add an error message
  1924. end
  1925. elseif is_pdf (url) then -- format is not set so if url is a pdf file then
  1926. format = wrap_style ('format', 'PDF'); -- set format to pdf
  1927. else
  1928. format = ''; -- empty string for concatenation
  1929. end
  1930. return format;
  1931. end
  1932. --[[--------------------------< G E T _ D I S P L A Y _ A U T H O R S _ E D I T O R S >------------------------
  1933. Returns a number that may or may not limit the length of the author or editor name lists.
  1934. When the value assigned to |display-authors= is a number greater than or equal to zero, return the number and
  1935. the previous state of the 'etal' flag (false by default but may have been set to true if the name list contains
  1936. some variant of the text 'et al.').
  1937. When the value assigned to |display-authors= is the keyword 'etal', return a number that is one greater than the
  1938. number of authors in the list and set the 'etal' flag true. This will cause the list_people() to display all of
  1939. the names in the name list followed by 'et al.'
  1940. In all other cases, returns nil and the previous state of the 'etal' flag.
  1941. ]]
  1942. local function get_display_authors_editors (max, count, list_name, etal)
  1943. if is_set (max) then
  1944. if 'etal' == max:lower():gsub("[ '%.]", '') then -- the :gsub() portion makes 'etal' from a variety of 'et al.' spellings and stylings
  1945. max = count + 1; -- number of authors + 1 so display all author names plus et al.
  1946. etal = true; -- overrides value set by extract_names()
  1947. elseif max:match ('^%d+$') then -- if is a string of numbers
  1948. max = tonumber (max); -- make it a number
  1949. if max >= count and 'authors' == list_name then -- AUTHORS ONLY -- if |display-xxxxors= value greater than or equal to number of authors/editors
  1950. add_maint_cat ('disp_auth_ed', list_name);
  1951. end
  1952. else -- not a valid keyword or number
  1953. table.insert( z.message_tail, { set_error( 'invalid_param_val', {'display-' .. list_name, max}, true ) } ); -- add error message
  1954. max = nil; -- unset
  1955. end
  1956. elseif 'authors' == list_name then -- AUTHORS ONLY need to clear implicit et al category
  1957. max = count + 1; -- number of authors + 1
  1958. end
  1959. return max, etal;
  1960. end
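-- Illustration only, not part of the module: a standalone sketch of the keyword test used above. Lower-casing the
-- value and deleting spaces, apostrophes and periods reduces many spellings of 'et al.' to 'etal'.
-- is_etal_keyword() is a hypothetical helper name.

```lua
local function is_etal_keyword (value)
	local squeezed = value:lower ():gsub ("[ '%.]", '');	-- only gsub's first return value is kept
	return 'etal' == squeezed;
end
```

-- Values such as 'et al.', 'Et Al' and ''et al.'' (wiki italic markup) all pass; a number such as '3' does not.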
  1961. --[[--------------------------< E X T R A _ T E X T _ I N _ P A G E _ C H E C K >------------------------------
  1962. Adds page to Category:CS1 maint: extra text if |page= or |pages= has what appears to be some form of p. or pp.
  1963. abbreviation in the first characters of the parameter content.
  1964. check Page and Pages for extraneous p, p., pp, and pp. at start of parameter value:
  1965. good pattern: '^P[^%.P%l]' matches when |page(s)= begins PX or P# but not Px where x and X are letters and # is a digit
  1966. bad pattern: '^[Pp][Pp]' matches when |page(s)= begins pp or pP or Pp or PP
  1967. ]]
  1968. local function extra_text_in_page_check (page)
  1969. -- local good_pattern = '^P[^%.P%l]';
  1970. local good_pattern = '^P[^%.Pp]'; -- ok to begin with uppercase P: P7 (pg 7 of section P) but not p123 (page 123) TODO: add Gg for PG or Pg?
  1971. -- local bad_pattern = '^[Pp][Pp]';
  1972. local bad_pattern = '^[Pp]?[Pp]%.?[ %d]';
  1973. if not page:match (good_pattern) and (page:match (bad_pattern) or page:match ('^[Pp]ages?')) then
  1974. add_maint_cat ('extra_text');
  1975. end
  1976. -- if Page:match ('^[Pp]?[Pp]%.?[ %d]') or Page:match ('^[Pp]ages?[ %d]') or
  1977. -- Pages:match ('^[Pp]?[Pp]%.?[ %d]') or Pages:match ('^[Pp]ages?[ %d]') then
  1978. -- add_maint_cat ('extra_text');
  1979. -- end
  1980. end
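-- Illustration only, not part of the module: the same patterns as extra_text_in_page_check() wrapped in a
-- hypothetical function that returns a boolean instead of adding a maintenance category, so the effect of the
-- patterns can be tried on sample values.

```lua
local function has_extra_page_text (page)
	local good_pattern = '^P[^%.Pp]';			-- e.g. P7: page 7 of section P
	local bad_pattern = '^[Pp]?[Pp]%.?[ %d]';	-- p, p., pp, pp. followed by a space or digit
	if page:match (good_pattern) then
		return false;
	end
	return (page:match (bad_pattern) or page:match ('^[Pp]ages?')) ~= nil;
end
```

-- 'p. 12', 'pp. 12-15' and 'pages 3' are flagged; 'P7' and '12' are not.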
  1981. --[[--------------------------< P A R S E _ V A U T H O R S _ V E D I T O R S >--------------------------------
  1982. This function extracts author / editor names from |vauthors= or |veditors= and finds matching |xxxxor-maskn= and
  1983. |xxxxor-linkn= in args. It then returns a table of assembled names just as extract_names() does.
  1984. Author / editor names in |vauthors= or |veditors= must be in Vancouver system style. Corporate or institutional names
  1985. may sometimes be required and because such names will often fail the is_good_vanc_name() and other format compliance
  1986. tests, are wrapped in doubled parentheses ((corporate name)) to suppress the format tests.
  1987. This function sets the vancouver error when a required comma is missing and when there is a space between an author's initials.
  1988. ]]
  1989. local function parse_vauthors_veditors (args, vparam, list_name)
  1990. local names = {}; -- table of names assembled from |vauthors=, |author-maskn=, |author-linkn=
  1991. local v_name_table = {};
  1992. local etal = false; -- return value set to true when we find some form of et al. in the vauthors parameter
  1993. local last, first, link, mask;
  1994. local corporate = false;
  1995. vparam, etal = name_has_etal (vparam, etal, true); -- find and remove variations on et al. do not categorize (do it here because et al. might have a period)
  1996. if vparam:find ('%[%[') or vparam:find ('%]%]') then -- no wikilinking vauthors names
  1997. add_vanc_error ();
  1998. end
  1999. v_name_table = mw.text.split(vparam, "%s*,%s*") -- names are separated by commas
  2000. for i, v_name in ipairs(v_name_table) do
  2001. if v_name:match ('^%(%(.+%)%)$') then -- corporate authors are wrapped in doubled parentheses to suppress vanc formatting and error detection
  2002. first = ''; -- set to empty string for concatenation and because it may have been set for previous author/editor
  2003. last = v_name:match ('^%(%((.+)%)%)$')
  2004. corporate = true;
  2005. elseif string.find(v_name, "%s") then
  2006. local lastfirstTable; -- local so the table does not leak into the global environment
  2007. lastfirstTable = mw.text.split(v_name, "%s")
  2008. first = table.remove(lastfirstTable); -- removes and returns value of last element in table which should be author initials
  2009. last = table.concat(lastfirstTable, " ") -- returns a string that is the concatenation of all other names that are not initials
  2010. if mw.ustring.match (last, '%a+%s+%u+%s+%a+') or mw.ustring.match (v_name, ' %u %u$') then
  2011. add_vanc_error (); -- matches last II last; the case when a comma is missing or a space between two initials
  2012. end
  2013. else
  2014. first = ''; -- set to empty string for concatenation and because it may have been set for previous author/editor
  2015. last = v_name; -- last name or single corporate name? Doesn't support multiword corporate names? do we need this?
  2016. end
  2017. if is_set (first) and not mw.ustring.match (first, "^%u?%u$") then -- first shall contain one or two upper-case letters, nothing else
  2018. add_vanc_error ();
  2019. end
  2020. -- this from extract_names ()
  2021. link = select_one( args, cfg.aliases[list_name .. '-Link'], 'redundant_parameters', i );
  2022. mask = select_one( args, cfg.aliases[list_name .. '-Mask'], 'redundant_parameters', i );
  2023. names[i] = {last = last, first = first, link = link, mask = mask, corporate=corporate}; -- add this assembled name to our names list
  2024. end
  2025. return names, etal; -- all done, return our list of names
  2026. end
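-- Illustration only, not part of the module: a standalone sketch of how one Vancouver-style name is split into a
-- last name and initials. split_vanc_name() and valid_initials() are hypothetical helpers; the sketch uses a single
-- string pattern where the module uses mw.text.split(), and it performs none of the module's error reporting.

```lua
local function split_vanc_name (v_name)
	local corporate_name = v_name:match ('^%(%((.+)%)%)$');	-- ((corporate name))
	if corporate_name then
		return corporate_name, '', true;
	end
	local last, first = v_name:match ('^(.-)%s+(%S+)$');		-- final word is taken as the initials
	if not last then
		return v_name, '', false;				-- single word: last name only
	end
	return last, first, false;
end

local function valid_initials (first)			-- one or two upper-case letters, nothing else
	return first:match ('^%u?%u$') ~= nil;
end
```

-- 'van der Berg JP' splits into last name 'van der Berg' and initials 'JP'; '((World Health Organization))' is kept whole.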
  2027. --[[--------------------------< S E L E C T _ A U T H O R _ E D I T O R _ S O U R C E >------------------------
  2028. Select one of |authors=, |authorn= / |lastn= / |firstn=, or |vauthors= as the source of the author name list or
  2029. select one of |editors=, |editorn= / |editor-lastn= / |editor-firstn= or |veditors= as the source of the editor name list.
  2030. Only one of these appropriate three will be used. The hierarchy is: |authorn= (and aliases) highest and |authors= lowest and
  2031. similarly, |editorn= (and aliases) highest and |editors= lowest
  2032. When looking for |authorn= / |editorn= parameters, test |xxxxor1= and |xxxxor2= (and all of their aliases); stops after the second
  2033. test which mimics the test used in extract_names() when looking for a hole in the author name list. There may be a better
  2034. way to do this, I just haven't discovered what that way is.
  2035. Emits an error message when more than one xxxxor name source is provided.
  2036. In this function, vxxxxors = vauthors or veditors; xxxxors = authors or editors as appropriate.
  2037. ]]
  2038. local function select_author_editor_source (vxxxxors, xxxxors, args, list_name)
  2039. local lastfirst = false;
  2040. if select_one( args, cfg.aliases[list_name .. '-Last'], 'none', 1 ) or -- do this twice in case we have a first1 without a last1
  2041. select_one( args, cfg.aliases[list_name .. '-Last'], 'none', 2 ) then
  2042. lastfirst=true;
  2043. end
  2044. if (is_set (vxxxxors) and true == lastfirst) or -- these are the three error conditions
  2045. (is_set (vxxxxors) and is_set (xxxxors)) or
  2046. (true == lastfirst and is_set (xxxxors)) then
  2047. local err_name;
  2048. if 'AuthorList' == list_name then -- figure out which name should be used in error message
  2049. err_name = 'author';
  2050. else
  2051. err_name = 'editor';
  2052. end
  2053. table.insert( z.message_tail, { set_error( 'redundant_parameters',
  2054. {err_name .. '-name-list parameters'}, true ) } ); -- add error message
  2055. end
  2056. if true == lastfirst then return 1 end; -- return a number indicating which author name source to use
  2057. if is_set (vxxxxors) then return 2 end;
  2058. if is_set (xxxxors) then return 3 end;
  2059. return 1; -- no authors so return 1; this allows missing author name test to run in case there is a first without last
  2060. end
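-- Illustrative summary (added note, derived from the code above; not upstream documentation).
-- It assumes select_one() returns a set value when one of the listed aliases is present:
--   |last1= or |last2= (or an alias) present     -> returns 1
--   only |vauthors= / |veditors= set             -> returns 2
--   only |authors= / |editors= set               -> returns 3
--   no name source set                           -> returns 1
-- When two or more of the three sources are set, a redundant_parameters error is queued in
-- z.message_tail and the return value still follows the order above.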
  2061. --[[--------------------------< I S _ V A L I D _ P A R A M E T E R _ V A L U E >------------------------------
  2062. This function is used to validate a parameter's assigned value for those parameters that have only a limited number
  2063. of allowable values (yes, y, true, no, etc). When the parameter value has not been assigned a value (missing or empty
  2064. in the source template) the function returns true. If the parameter value is one of the list of allowed values returns
  2065. true; else, emits an error message and returns false.
  2066. ]]
  2067. local function is_valid_parameter_value (value, name, possible)
  2068. if not is_set (value) then
  2069. return true; -- an empty parameter is ok
  2070. elseif in_array(value:lower(), possible) then
  2071. return true;
  2072. else
  2073. table.insert( z.message_tail, { set_error( 'invalid_param_val', {name, value}, true ) } ); -- not an allowed value so add error message
  2074. return false
  2075. end
  2076. end
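-- Illustrative calls (added note). The keyword list {'yes', 'true', 'y'} is a hypothetical stand-in for a
-- cfg.keywords entry, and is_set() / in_array() are assumed to behave as their names suggest:
--   is_valid_parameter_value ('', 'nopp', {'yes', 'true', 'y'})        -- true; an empty value is accepted
--   is_valid_parameter_value ('Yes', 'nopp', {'yes', 'true', 'y'})     -- true; the comparison uses value:lower()
--   is_valid_parameter_value ('maybe', 'nopp', {'yes', 'true', 'y'})   -- false; queues an invalid_param_val error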
  2077. --[[--------------------------< T E R M I N A T E _ N A M E _ L I S T >----------------------------------------
  2078. This function terminates a name list (author, contributor, editor) with a separator character (sepc) and a space.
  2079. When the last character is already a sepc character, or when the last three characters are sepc followed by two
  2080. closing square brackets (close of a wikilink), the name_list is terminated with a single space character only;
  2081. otherwise sepc and a space are appended.
  2082. ]]
  2083. local function terminate_name_list (name_list, sepc)
  2084. if (string.sub (name_list,-1,-1) == sepc) or (string.sub (name_list,-3,-1) == sepc .. ']]') then -- if last name in list ends with sepc char
  2085. return name_list .. " "; -- don't add another
  2086. else
  2087. return name_list .. sepc .. ' '; -- otherwise terminate the name list
  2088. end
  2089. end
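-- Illustrative calls (added note), assuming sepc is '.':
--   terminate_name_list ('Smith, John', '.')       -- 'Smith, John. '   (sepc and a space appended)
--   terminate_name_list ('Smith, J.', '.')         -- 'Smith, J. '      (already ends with sepc; space only)
--   terminate_name_list ('[[Smith, J.]]', '.')     -- '[[Smith, J.]] '  (sepc precedes the closing wikilink brackets; space only)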
  2090. --[[-------------------------< F O R M A T _ V O L U M E _ I S S U E >----------------------------------------
  2091. returns the concatenation of the formatted volume and issue parameters as a single string; or formatted volume
  2092. or formatted issue, or an empty string if neither is set.
  2093. ]]
  2094. local function format_volume_issue (volume, issue, cite_class, origin, sepc, lower)
  2095. if not is_set (volume) and not is_set (issue) then
  2096. return '';
  2097. end
  2098. if 'magazine' == cite_class or (in_array (cite_class, {'citation', 'map'}) and 'magazine' == origin) then
  2099. if is_set (volume) and is_set (issue) then
  2100. return wrap_msg ('vol-no', {sepc, volume, issue}, lower);
  2101. elseif is_set (volume) then
  2102. return wrap_msg ('vol', {sepc, volume}, lower);
  2103. else
  2104. return wrap_msg ('issue', {sepc, issue}, lower);
  2105. end
  2106. end
  2107. local vol = '';
  2108. if is_set (volume) then
  2109. if (6 < mw.ustring.len(volume)) then
  2110. vol = substitute (cfg.messages['j-vol'], {sepc, volume});
  2111. else
  2112. vol = wrap_style ('vol-bold', hyphen_to_dash(volume));
  2113. end
  2114. end
  2115. if is_set (issue) then
  2116. return vol .. substitute (cfg.messages['j-issue'], issue);
  2117. end
  2118. return vol;
  2119. end
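-- Illustrative summary of the branches above (added note; the rendered text depends on cfg.messages, which is defined elsewhere):
--   neither volume nor issue set                                   -> ''
--   magazine cite (or citation / map with a 'magazine' origin)     -> 'vol-no', 'vol', or 'issue' message via wrap_msg()
--   any other cite, volume longer than 6 characters                -> 'j-vol' message with sepc
--   any other cite, volume of 6 characters or fewer                -> 'vol-bold' style applied to hyphen_to_dash (volume)
--   issue set (non-magazine)                                       -> 'j-issue' message appended to the volume text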
  2120. --[[-------------------------< F O R M A T _ P A G E S _ S H E E T S >-----------------------------------------
  2121. adds static text to one of |page(s)= or |sheet(s)= values and returns it with all of the others set to empty strings.
  2122. The return order is:
  2123. page, pages, sheet, sheets
  2124. Singular has priority over plural when both are provided.
  2125. ]]
  2126. local function format_pages_sheets (page, pages, sheet, sheets, cite_class, origin, sepc, nopp, lower)
  2127. if 'map' == cite_class then -- only cite map supports sheet(s) as in-source locators
  2128. if is_set (sheet) then
  2129. if 'journal' == origin then
  2130. return '', '', wrap_msg ('j-sheet', sheet, lower), '';
  2131. else
  2132. return '', '', wrap_msg ('sheet', {sepc, sheet}, lower), '';
  2133. end
  2134. elseif is_set (sheets) then
  2135. if 'journal' == origin then
  2136. return '', '', '', wrap_msg ('j-sheets', sheets, lower);
  2137. else
  2138. return '', '', '', wrap_msg ('sheets', {sepc, sheets}, lower);
  2139. end
  2140. end
  2141. end
  2142. local is_journal = 'journal' == cite_class or (in_array (cite_class, {'citation', 'map'}) and 'journal' == origin);
  2143. if is_set (page) then
  2144. if is_journal then
  2145. return substitute (cfg.messages['j-page(s)'], page), '', '', '';
  2146. elseif not nopp then
  2147. return substitute (cfg.messages['p-prefix'], {sepc, page}), '', '', '';
  2148. else
  2149. return substitute (cfg.messages['nopp'], {sepc, page}), '', '', '';
  2150. end
  2151. elseif is_set(pages) then
  2152. if is_journal then
  2153. return substitute (cfg.messages['j-page(s)'], pages), '', '', '';
  2154. elseif tonumber(pages) ~= nil and not nopp then -- if pages converts to a number (tonumber), assume a single page number
  2155. return '', substitute (cfg.messages['p-prefix'], {sepc, pages}), '', '';
  2156. elseif not nopp then
  2157. return '', substitute (cfg.messages['pp-prefix'], {sepc, pages}), '', '';
  2158. else
  2159. return '', substitute (cfg.messages['nopp'], {sepc, pages}), '', '';
  2160. end
  2161. end
  2162. return '', '', '', ''; -- return empty strings
  2163. end
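-- Illustrative calls (added note). Only the position of the non-empty return value is shown; the rendered text
-- depends on cfg.messages. The argument values are hypothetical, and is_set ('') is assumed to be false:
--   format_pages_sheets ('5', '7-9', '', '', 'book', '', '.', nil, false)     -- 1st value uses 'p-prefix'; |page= wins over |pages=
--   format_pages_sheets ('', '12', '', '', 'book', '', '.', nil, false)       -- 2nd value uses 'p-prefix'; numeric |pages= is treated as one page
--   format_pages_sheets ('', '12–15', '', '', 'book', '', '.', nil, false)    -- 2nd value uses 'pp-prefix'
--   format_pages_sheets ('', '12–15', '', '', 'journal', '', '.', nil, false) -- 1st value uses 'j-page(s)'; journal |pages= is returned in the page slot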
  2164. --[[--------------------------< C I T A T I O N 0 >------------------------------------------------------------
  2165. This is the main function doing the majority of the citation formatting.
  2166. ]]
  2167. local function citation0( config, args)
  2168. --[[
  2169. Load Input Parameters
  2170. The argument_wrapper facilitates the mapping of multiple aliases to a single internal variable.
  2171. ]]
  2172. local A = argument_wrapper( args );
  2173. local i
  2174. -- Pick out the relevant fields from the arguments. Different citation templates
  2175. -- define different field names for the same underlying things.
  2176. local author_etal;
  2177. local a = {}; -- authors list from |lastn= / |firstn= pairs or |vauthors=
  2178. local Authors;
  2179. local NameListFormat = A['NameListFormat'];
  2180. do -- to limit scope of selected
  2181. local selected = select_author_editor_source (A['Vauthors'], A['Authors'], args, 'AuthorList');
  2182. if 1 == selected then
  2183. a, author_etal = extract_names (args, 'AuthorList'); -- fetch author list from |authorn= / |lastn= / |firstn=, |author-linkn=, and |author-maskn=
  2184. elseif 2 == selected then
  2185. NameListFormat = 'vanc'; -- override whatever |name-list-format= might be
  2186. a, author_etal = parse_vauthors_veditors (args, args.vauthors, 'AuthorList'); -- fetch author list from |vauthors=, |author-linkn=, and |author-maskn=
  2187. elseif 3 == selected then
  2188. Authors = A['Authors']; -- use content of |authors=
  2189. end
  2190. end
  2191. local Coauthors = A['Coauthors'];
  2192. local Others = A['Others'];
  2193. local editor_etal;
  2194. local e = {}; -- editors list from |editor-lastn= / |editor-firstn= pairs or |veditors=
  2195. local Editors;
  2196. do -- to limit scope of selected
  2197. local selected = select_author_editor_source (A['Veditors'], A['Editors'], args, 'EditorList');
  2198. if 1 == selected then
  2199. e, editor_etal = extract_names (args, 'EditorList'); -- fetch editor list from |editorn= / |editor-lastn= / |editor-firstn=, |editor-linkn=, and |editor-maskn=
  2200. elseif 2 == selected then
  2201. NameListFormat = 'vanc'; -- override whatever |name-list-format= might be
  2202. e, editor_etal = parse_vauthors_veditors (args, args.veditors, 'EditorList'); -- fetch editor list from |veditors=, |editor-linkn=, and |editor-maskn=
  2203. elseif 3 == selected then
  2204. Editors = A['Editors']; -- use content of |editors=
  2205. end
  2206. end
  2207. local t = {}; -- translators list from |translator-lastn= / |translator-firstn= pairs
  2208. local Translators; -- assembled translators name list
  2209. t = extract_names (args, 'TranslatorList'); -- fetch translator list from |translatorn= / |translator-lastn=, -firstn=, -linkn=, -maskn=
  2210. local c = {}; -- contributors list from |contributor-lastn= / |contributor-firstn= pairs
  2211. local Contributors; -- assembled contributors name list
  2212. local Contribution = A['Contribution'];
  2213. if in_array(config.CitationClass, {"book","citation"}) and not is_set(A['Periodical']) then -- |contributor= and |contribution= only supported in book cites
  2214. c = extract_names (args, 'ContributorList'); -- fetch contributor list from |contributorn= / |contributor-lastn=, -firstn=, -linkn=, -maskn=
  2215. if 0 < #c then
  2216. if not is_set (Contribution) then -- |contributor= requires |contribution=
  2217. table.insert( z.message_tail, { set_error( 'contributor_missing_required_param', 'contribution')}); -- add missing contribution error message
  2218. c = {}; -- blank the contributors' table; it is used as a flag later
  2219. end
  2220. if 0 == #a then -- |contributor= requires |author=
  2221. table.insert( z.message_tail, { set_error( 'contributor_missing_required_param', 'author')}); -- add missing author error message
  2222. c = {}; -- blank the contributors' table; it is used as a flag later
  2223. end
  2224. end
  2225. else -- if not a book cite
  2226. if select_one (args, cfg.aliases['ContributorList-Last'], 'redundant_parameters', 1 ) then -- are there contributor name list parameters?
  2227. table.insert( z.message_tail, { set_error( 'contributor_ignored')}); -- add contributor ignored error message
  2228. end
  2229. Contribution = nil; -- unset
  2230. end
  2231. if not is_valid_parameter_value (NameListFormat, 'name-list-format', cfg.keywords['name-list-format']) then -- only accepted value for this parameter is 'vanc'
  2232. NameListFormat = ''; -- anything else, set to empty string
  2233. end
  2234. local Year = A['Year'];
  2235. local PublicationDate = A['PublicationDate'];
  2236. local OrigYear = A['OrigYear'];
  2237. local Date = A['Date'];
  2238. local LayDate = A['LayDate'];
  2239. ------------------------------------------------- Get title data
  2240. local Title = A['Title'];
  2241. local ScriptTitle = A['ScriptTitle'];
  2242. local BookTitle = A['BookTitle'];
  2243. local Conference = A['Conference'];
  2244. local TransTitle = A['TransTitle'];
  2245. local TitleNote = A['TitleNote'];
  2246. local TitleLink = A['TitleLink'];
  2247. if is_set (TitleLink) and false == link_param_ok (TitleLink) then
  2248. table.insert( z.message_tail, { set_error( 'bad_paramlink', A:ORIGIN('TitleLink'))}); -- url or wikilink in |title-link=;
  2249. end
  2250. local Chapter = A['Chapter'];
  2251. local ScriptChapter = A['ScriptChapter'];
  2252. local ChapterLink -- = A['ChapterLink']; -- deprecated as a parameter but still used internally by cite episode
  2253. local TransChapter = A['TransChapter'];
  2254. local TitleType = A['TitleType'];
  2255. local Degree = A['Degree'];
  2256. local Docket = A['Docket'];
  2257. local ArchiveFormat = A['ArchiveFormat'];
  2258. local ArchiveURL = A['ArchiveURL'];
  2259. local URL = A['URL']
  2260. local URLorigin = A:ORIGIN('URL'); -- get name of parameter that holds URL
  2261. local ChapterURL = A['ChapterURL'];
  2262. local ChapterURLorigin = A:ORIGIN('ChapterURL'); -- get name of parameter that holds ChapterURL
  2263. local ConferenceFormat = A['ConferenceFormat'];
  2264. local ConferenceURL = A['ConferenceURL'];
  2265. local ConferenceURLorigin = A:ORIGIN('ConferenceURL'); -- get name of parameter that holds ConferenceURL
  2266. local Periodical = A['Periodical'];
  2267. local Periodical_origin = A:ORIGIN('Periodical'); -- get the name of the periodical parameter
  2268. local Series = A['Series'];
  2269. local Volume;
  2270. local Issue;
  2271. local Page;
  2272. local Pages;
  2273. local At;
  2274. if in_array (config.CitationClass, cfg.templates_using_volume) and not ('conference' == config.CitationClass and not is_set (Periodical)) then
  2275. Volume = A['Volume'];
  2276. end
  2277. if in_array (config.CitationClass, cfg.templates_using_issue) and not (in_array (config.CitationClass, {'conference', 'map'}) and not is_set (Periodical)) then
  2278. Issue = A['Issue'];
  2279. end
  2280. local Position = '';
  2281. if not in_array (config.CitationClass, cfg.templates_not_using_page) then
  2282. Page = A['Page'];
  2283. Pages = hyphen_to_dash( A['Pages'] );
  2284. At = A['At'];
  2285. end
  2286. local Edition = A['Edition'];
  2287. local PublicationPlace = A['PublicationPlace']
  2288. local Place = A['Place'];
  2289. local PublisherName = A['PublisherName'];
  2290. local RegistrationRequired = A['RegistrationRequired'];
  2291. if not is_valid_parameter_value (RegistrationRequired, 'registration', cfg.keywords ['yes_true_y']) then
  2292. RegistrationRequired=nil;
  2293. end
  2294. local SubscriptionRequired = A['SubscriptionRequired'];
  2295. if not is_valid_parameter_value (SubscriptionRequired, 'subscription', cfg.keywords ['yes_true_y']) then
  2296. SubscriptionRequired=nil;
  2297. end
  2298. local Via = A['Via'];
  2299. local AccessDate = A['AccessDate'];
  2300. local ArchiveDate = A['ArchiveDate'];
  2301. local Agency = A['Agency'];
  2302. local DeadURL = A['DeadURL']
  2303. if not is_valid_parameter_value (DeadURL, 'dead-url', cfg.keywords ['deadurl']) then -- set in config.defaults to 'yes'
  2304. DeadURL = ''; -- anything else, set to empty string
  2305. end
  2306. local Language = A['Language'];
  2307. local Format = A['Format'];
  2308. local ChapterFormat = A['ChapterFormat'];
  2309. local DoiBroken = A['DoiBroken'];
  2310. local ID = A['ID'];
  2311. local ASINTLD = A['ASINTLD'];
  2312. local IgnoreISBN = A['IgnoreISBN'];
  2313. if not is_valid_parameter_value (IgnoreISBN, 'ignore-isbn-error', cfg.keywords ['yes_true_y']) then
  2314. IgnoreISBN = nil; -- anything else, unset
  2315. end
  2316. local Embargo = A['Embargo'];
  2317. local Class = A['Class']; -- arxiv class identifier
  2318. local ID_list = extract_ids( args );
  2319. local Quote = A['Quote'];
  2320. local LayFormat = A['LayFormat'];
  2321. local LayURL = A['LayURL'];
  2322. local LaySource = A['LaySource'];
  2323. local Transcript = A['Transcript'];
  2324. local TranscriptFormat = A['TranscriptFormat'];
  2325. local TranscriptURL = A['TranscriptURL']
  2326. local TranscriptURLorigin = A:ORIGIN('TranscriptURL'); -- get name of parameter that holds TranscriptURL
  2327. local LastAuthorAmp = A['LastAuthorAmp'];
  2328. if not is_valid_parameter_value (LastAuthorAmp, 'last-author-amp', cfg.keywords ['yes_true_y']) then
  2329. LastAuthorAmp = nil; -- unset
  2330. end
  2331. local no_tracking_cats = A['NoTracking'];
  2332. if not is_valid_parameter_value (no_tracking_cats, 'no-tracking', cfg.keywords ['yes_true_y']) then
  2333. no_tracking_cats = nil; -- unset
  2334. end
  2335. --these are used by cite interview
  2336. local Callsign = A['Callsign'];
  2337. local City = A['City'];
  2338. local Program = A['Program'];
  2339. --local variables that are not cs1 parameters
  2340. local use_lowercase; -- controls capitalization of certain static text
  2341. local this_page = mw.title.getCurrentTitle(); -- also used for COinS and for language
  2342. local anchor_year; -- used in the CITEREF identifier
  2343. local COinS_date = {}; -- holds date info extracted from |date= for the COinS metadata by Module:Citation/CS1/Date_validation
  2344. -- set default parameter values defined by |mode= parameter. If |mode= is empty or omitted, use CitationClass to set these values
  2345. local Mode = A['Mode'];
  2346. if not is_valid_parameter_value (Mode, 'mode', cfg.keywords['mode']) then
  2347. Mode = '';
  2348. end
  2349. local sepc; -- separator between citation elements: for CS1 a period, for CS2 a comma
  2350. local PostScript;
  2351. local Ref;
  2352. sepc, PostScript, Ref = set_style (Mode:lower(), A['PostScript'], A['Ref'], config.CitationClass);
  2353. use_lowercase = ( sepc == ',' ); -- used to control capitalization for certain static text
  2354. --check this page to see if it is in one of the namespaces that cs1 is not supposed to add to the error categories
  2355. if not is_set (no_tracking_cats) then -- ignore if we are already not going to categorize this page
  2356. if in_array (this_page.nsText, cfg.uncategorized_namespaces) then
  2357. no_tracking_cats = "true"; -- set no_tracking_cats
  2358. end
  2359. for _,v in ipairs (cfg.uncategorized_subpages) do -- cycle through page name patterns
  2360. if this_page.text:match (v) then -- test page name against each pattern
  2361. no_tracking_cats = "true"; -- set no_tracking_cats
  2362. break; -- bail out if one is found
  2363. end
  2364. end
  2365. end
  2366. -- check for extra |page=, |pages= or |at= parameters. (also sheet and sheets while we're at it)
  2367. select_one( args, {'page', 'p', 'pp', 'pages', 'at', 'sheet', 'sheets'}, 'redundant_parameters' ); -- this is a dummy call simply to get the error message and category
  2368. local NoPP = A['NoPP']
  2369. if is_set (NoPP) and is_valid_parameter_value (NoPP, 'nopp', cfg.keywords ['yes_true_y']) then
  2370. NoPP = true;
  2371. else
  2372. NoPP = nil; -- unset, used as a flag later
  2373. end
  2374. if is_set(Page) then
  2375. if is_set(Pages) or is_set(At) then
  2376. Pages = ''; -- unset the others
  2377. At = '';
  2378. end
  2379. extra_text_in_page_check (Page); -- add this page to maint cat if |page= value begins with what looks like p. or pp.
  2380. elseif is_set(Pages) then
  2381. if is_set(At) then
  2382. At = ''; -- unset
  2383. end
  2384. extra_text_in_page_check (Pages); -- add this page to maint cat if |pages= value begins with what looks like p. or pp.
  2385. end
  2386. -- both |publication-place= and |place= (|location=) allowed if different
  2387. if not is_set(PublicationPlace) and is_set(Place) then
  2388. PublicationPlace = Place; -- promote |place= (|location=) to |publication-place=
  2389. end
  2390. if PublicationPlace == Place then Place = ''; end -- don't need both if they are the same
  2391. --[[
  2392. Parameter remapping for cite encyclopedia:
  2393. When the citation has these parameters:
  2394. |encyclopedia and |title then map |title to |article and |encyclopedia to |title
  2395. |encyclopedia and |article then map |encyclopedia to |title
  2396. |encyclopedia then map |encyclopedia to |title
  2397. |trans_title maps to |trans_chapter when |title is re-mapped
  2398. |url maps to |chapterurl when |title is remapped
  2399. All other combinations of |encyclopedia, |title, and |article are not modified
  2400. ]]
  2401. local Encyclopedia = A['Encyclopedia'];
  2402. if ( config.CitationClass == "encyclopaedia" ) or ( config.CitationClass == "citation" and is_set (Encyclopedia)) then -- test code for citation
  2403. if is_set(Periodical) then -- Periodical is set when |encyclopedia is set
  2404. if is_set(Title) or is_set (ScriptTitle) then
  2405. if not is_set(Chapter) then
  2406. Chapter = Title; -- |encyclopedia and |title are set so map |title to |article and |encyclopedia to |title
  2407. ScriptChapter = ScriptTitle;
  2408. TransChapter = TransTitle;
  2409. ChapterURL = URL;
  2410. if not is_set (ChapterURL) and is_set (TitleLink) then
  2411. Chapter= '[[' .. TitleLink .. '|' .. Chapter .. ']]';
  2412. end
  2413. Title = Periodical;
  2414. ChapterFormat = Format;
  2415. Periodical = ''; -- redundant so unset
  2416. TransTitle = '';
  2417. URL = '';
  2418. Format = '';
  2419. TitleLink = '';
  2420. ScriptTitle = '';
  2421. end
  2422. else -- |title not set
  2423. Title = Periodical; -- |encyclopedia set and |article set or not set so map |encyclopedia to |title
  2424. Periodical = ''; -- redundant so unset
  2425. end
  2426. end
  2427. end
  2428. -- Special case for cite techreport.
  2429. if (config.CitationClass == "techreport") then -- special case for cite techreport
  2430. if is_set(A['Number']) then -- cite techreport uses 'number', which other citations alias to 'issue'
  2431. if not is_set(ID) then -- can we use ID for the "number"?
  2432. ID = A['Number']; -- yes, use it
  2433. else -- ID has a value so emit error message
  2434. table.insert( z.message_tail, { set_error('redundant_parameters', {wrap_style ('parameter', 'id') .. ' and ' .. wrap_style ('parameter', 'number')}, true )});
  2435. end
  2436. end
  2437. end
  2438. -- special case for cite interview
  2439. if (config.CitationClass == "interview") then
  2440. if is_set(Program) then
  2441. ID = ' ' .. Program;
  2442. end
  2443. if is_set(Callsign) then
  2444. if is_set(ID) then
  2445. ID = ID .. sepc .. ' ' .. Callsign;
  2446. else
  2447. ID = ' ' .. Callsign;
  2448. end
  2449. end
  2450. if is_set(City) then
  2451. if is_set(ID) then
  2452. ID = ID .. sepc .. ' ' .. City;
  2453. else
  2454. ID = ' ' .. City;
  2455. end
  2456. end
  2457. if is_set(Others) then
  2458. if is_set(TitleType) then
  2459. Others = ' ' .. TitleType .. ' with ' .. Others;
  2460. TitleType = '';
  2461. else
  2462. Others = ' ' .. 'Interview with ' .. Others;
  2463. end
  2464. else
  2465. Others = '(Interview)';
  2466. end
  2467. end
  2468. -- special case for cite mailing list
  2469. if (config.CitationClass == "mailinglist") then
  2470. Periodical = A ['MailingList'];
  2471. elseif 'mailinglist' == A:ORIGIN('Periodical') then
  2472. Periodical = ''; -- unset because mailing list is only used for cite mailing list
  2473. end
  2474. -- Account for the oddity that is {{cite conference}}, before generation of COinS data.
  2475. if 'conference' == config.CitationClass then
  2476. if is_set(BookTitle) then
  2477. Chapter = Title;
  2478. -- ChapterLink = TitleLink; -- |chapterlink= is deprecated
  2479. ChapterURL = URL;
  2480. ChapterURLorigin = URLorigin;
  2481. URLorigin = '';
  2482. ChapterFormat = Format;
  2483. TransChapter = TransTitle;
  2484. Title = BookTitle;
  2485. Format = '';
  2486. -- TitleLink = '';
  2487. TransTitle = '';
  2488. URL = '';
  2489. end
  2490. elseif 'speech' ~= config.CitationClass then
  2491. Conference = ''; -- not cite conference or cite speech so make sure this is empty string
  2492. end
  2493. -- cite map oddities
  2494. local Cartography = "";
  2495. local Scale = "";
  2496. local Sheet = A['Sheet'] or '';
  2497. local Sheets = A['Sheets'] or '';
  2498. if config.CitationClass == "map" then
  2499. Chapter = A['Map'];
  2500. ChapterURL = A['MapURL'];
  2501. TransChapter = A['TransMap'];
  2502. ChapterURLorigin = A:ORIGIN('MapURL');
  2503. ChapterFormat = A['MapFormat'];
  2504. Cartography = A['Cartography'];
  2505. if is_set( Cartography ) then
  2506. Cartography = sepc .. " " .. wrap_msg ('cartography', Cartography, use_lowercase);
  2507. end
  2508. Scale = A['Scale'];
  2509. if is_set( Scale ) then
  2510. Scale = sepc .. " " .. Scale;
  2511. end
  2512. end
  2513. -- Account for the oddities that are {{cite episode}} and {{cite serial}}, before generation of COinS data.
  2514. if 'episode' == config.CitationClass or 'serial' == config.CitationClass then
  2515. local AirDate = A['AirDate'];
  2516. local SeriesLink = A['SeriesLink'];
  2517. if is_set (SeriesLink) and false == link_param_ok (SeriesLink) then
  2518. table.insert( z.message_tail, { set_error( 'bad_paramlink', A:ORIGIN('SeriesLink'))});
  2519. end
  2520. local Network = A['Network'];
  2521. local Station = A['Station'];
  2522. local s, n = {}, {};
  2523. -- do common parameters first
  2524. if is_set(Network) then table.insert(n, Network); end
  2525. if is_set(Station) then table.insert(n, Station); end
  2526. ID = table.concat(n, sepc .. ' ');
  2527. if not is_set (Date) and is_set (AirDate) then -- promote airdate to date
  2528. Date = AirDate;
  2529. end
  2530. if 'episode' == config.CitationClass then -- handle the oddities that are strictly {{cite episode}}
  2531. local Season = A['Season'];
  2532. local SeriesNumber = A['SeriesNumber'];
  2533. if is_set (Season) and is_set (SeriesNumber) then -- these are mutually exclusive so if both are set
  2534. table.insert( z.message_tail, { set_error( 'redundant_parameters', {wrap_style ('parameter', 'season') .. ' and ' .. wrap_style ('parameter', 'seriesno')}, true ) } ); -- add error message
  2535. SeriesNumber = ''; -- unset; prefer |season= over |seriesno=
  2536. end
  2537. -- assemble a table of parts concatenated later into Series
  2538. if is_set(Season) then table.insert(s, wrap_msg ('season', Season, use_lowercase)); end
  2539. if is_set(SeriesNumber) then table.insert(s, wrap_msg ('series', SeriesNumber, use_lowercase)); end
  2540. if is_set(Issue) then table.insert(s, wrap_msg ('episode', Issue, use_lowercase)); end
  2541. Issue = ''; -- unset because this is not a unique parameter
  2542. Chapter = Title; -- promote title parameters to chapter
  2543. ScriptChapter = ScriptTitle;
  2544. ChapterLink = TitleLink; -- alias episodelink
  2545. TransChapter = TransTitle;
  2546. ChapterURL = URL;
  2547. ChapterURLorigin = A:ORIGIN('URL');
  2548. Title = Series; -- promote series to title
  2549. TitleLink = SeriesLink;
  2550. Series = table.concat(s, sepc .. ' '); -- this is concatenation of season, seriesno, episode number
  2551. if is_set (ChapterLink) and not is_set (ChapterURL) then -- link but not URL
  2552. Chapter = '[[' .. ChapterLink .. '|' .. Chapter .. ']]'; -- ok to wikilink
  2553. elseif is_set (ChapterLink) and is_set (ChapterURL) then -- if both are set, URL links episode;
  2554. Series = '[[' .. ChapterLink .. '|' .. Series .. ']]'; -- series links with ChapterLink (episodelink -> TitleLink -> ChapterLink) ugly
  2555. end
  2556. URL = ''; -- unset
  2557. TransTitle = '';
  2558. ScriptTitle = '';
  2559. else -- now oddities that are cite serial
  2560. Issue = ''; -- unset because this parameter no longer supported by the citation/core version of cite serial
  2561. Chapter = A['Episode']; -- TODO: make |episode= available to cite episode someday?
  2562. if is_set (Series) and is_set (SeriesLink) then
  2563. Series = '[[' .. SeriesLink .. '|' .. Series .. ']]';
  2564. end
  2565. Series = wrap_style ('italic-title', Series); -- series is italicized
  2566. end
  2567. end
  2568. -- end of {{cite episode}} stuff
  2569. -- Account for the oddities that are {{cite arxiv}}, before generation of COinS data.
  2570. if 'arxiv' == config.CitationClass then
  2571. if not is_set (ID_list['ARXIV']) then -- |arxiv= or |eprint= required for cite arxiv
  2572. table.insert( z.message_tail, { set_error( 'arxiv_missing', {}, true ) } ); -- add error message
  2573. elseif is_set (Series) then -- series is an alias of version
  2574. ID_list['ARXIV'] = ID_list['ARXIV'] .. Series; -- concatenate version onto the end of the arxiv identifier
  2575. Series = ''; -- unset
  2576. deprecated_parameter ('version'); -- deprecated parameter but only for cite arxiv
  2577. end
  2578. if first_set ({AccessDate, At, Chapter, Format, Page, Pages, Periodical, PublisherName, URL, -- a crude list of parameters that are not supported by cite arxiv
  2579. ID_list['ASIN'], ID_list['BIBCODE'], ID_list['DOI'], ID_list['ISBN'], ID_list['ISSN'],
  2580. ID_list['JFM'], ID_list['JSTOR'], ID_list['LCCN'], ID_list['MR'], ID_list['OCLC'], ID_list['OL'],
  2581. ID_list['OSTI'], ID_list['PMC'], ID_list['PMID'], ID_list['RFC'], ID_list['SSRN'], ID_list['USENETID'], ID_list['ZBL']},27) then
  2582. table.insert( z.message_tail, { set_error( 'arxiv_params_not_supported', {}, true ) } ); -- add error message
  2583. AccessDate= ''; -- set these to empty string; not supported in cite arXiv
  2584. PublisherName = ''; -- (if the article has been published, use cite journal, or other)
  2585. Chapter = '';
  2586. URL = '';
  2587. Format = '';
  2588. Page = ''; Pages = ''; At = '';
  2589. end
  2590. Periodical = 'arXiv'; -- set to arXiv for COinS; after that, must be set to empty string
  2591. end
  2592. -- handle type parameter for those CS1 citations that have default values
  2593. if in_array(config.CitationClass, {"AV-media-notes", "DVD-notes", "mailinglist", "map", "podcast", "pressrelease", "report", "techreport", "thesis"}) then
  2594. TitleType = set_titletype (config.CitationClass, TitleType);
  2595. if is_set(Degree) and "Thesis" == TitleType then -- special case for cite thesis
  2596. TitleType = Degree .. "论文"; -- "论文" is Chinese for "thesis"
  2597. end
  2598. end
  2599. if is_set(TitleType) then -- if type parameter is specified
  2600. TitleType = substitute( cfg.messages['type'], TitleType); -- display it in parentheses
  2601. end
  2602. -- legacy: promote |year= to Date if Date not set; or, promote PublicationDate to Date if neither Date nor Year is set.
  2603. if not is_set (Date) then
  2604. Date = Year; -- promote Year to Date
  2605. Year = nil; -- make nil so Year as empty string isn't used for CITEREF
  2606. if not is_set (Date) and is_set(PublicationDate) then -- use PublicationDate when |date= and |year= are not set
  2607. Date = PublicationDate; -- promote PublicationDate to Date
  2608. PublicationDate = ''; -- unset, no longer needed
  2609. end
  2610. end
  2611. if PublicationDate == Date then PublicationDate = ''; end -- if PublicationDate is same as Date, don't display in rendered citation
  2612. --[[
  2613. Go test all of the date-holding parameters for valid MOS:DATE format and make sure that dates are real dates. This must be done before we do COinS because here is where
  2614. we get the date used in the metadata.
  2615. Date validation supporting code is in Module:Citation/CS1/Date_validation
  2616. ]]
  2617. do -- create defined block to contain local variables error_message and mismatch
  2618. local error_message = '';
  2619. -- AirDate has been promoted to Date so not necessary to check it
  2620. anchor_year, error_message = dates({['access-date']=AccessDate, ['archive-date']=ArchiveDate, ['date']=Date, ['doi-broken-date']=DoiBroken,
  2621. ['embargo']=Embargo, ['lay-date']=LayDate, ['publication-date']=PublicationDate, ['year']=Year}, COinS_date);
  2622. if is_set (Year) and is_set (Date) then -- both |date= and |year= not normally needed;
  2623. local mismatch = year_date_check (Year, Date)
  2624. if 0 == mismatch then -- |year= does not match a year-value in |date=
  2625. if is_set (error_message) then -- if there is already an error message
  2626. error_message = error_message .. ', '; -- tack on this additional message
  2627. end
  2628. error_message = error_message .. '&#124;year= / &#124;date= mismatch';
  2629. elseif 1 == mismatch then -- |year= matches year-value in |date=
  2630. add_maint_cat ('date_year');
  2631. end
  2632. end
  2633. if is_set(error_message) then
  2634. table.insert( z.message_tail, { set_error( 'bad_date', {error_message}, true ) } ); -- add this error message
  2635. end
  2636. end -- end of do
  2637. -- Account for the oddity that is {{cite journal}} with |pmc= set and |url= not set. Do this after date check but before COinS.
  2638. -- Here we unset Embargo if PMC not embargoed (|embargo= not set in the citation) or if the embargo time has expired. Otherwise, holds embargo date
  2639. Embargo = is_embargoed (Embargo);
  2640. if config.CitationClass == "journal" and not is_set(URL) and is_set(ID_list['PMC']) then
  2641. if not is_set (Embargo) then -- if not embargoed or embargo has expired
  2642. URL=cfg.id_handlers['PMC'].prefix .. ID_list['PMC']; -- set url to be the same as the PMC external link if not embargoed
  2643. URLorigin = cfg.id_handlers['PMC'].parameters[1]; -- set URLorigin to parameter name for use in error message if citation is missing a |title=
  2644. end
  2645. end
  2646. -- At this point fields may be nil if they weren't specified in the template use. We can use that fact.
  2647. -- Test if citation has no title
  2648. if not is_set(Title) and
  2649. not is_set(TransTitle) and
  2650. not is_set(ScriptTitle) then
  2651. if 'episode' == config.CitationClass then -- special case for cite episode; TODO: is there a better way to do this?
  2652. table.insert( z.message_tail, { set_error( 'citation_missing_title', {'series'}, true ) } );
  2653. else
  2654. table.insert( z.message_tail, { set_error( 'citation_missing_title', {'title'}, true ) } );
  2655. end
  2656. end
  2657. if 'none' == Title and in_array (config.CitationClass, {'journal', 'citation'}) and is_set (Periodical) and 'journal' == A:ORIGIN('Periodical') then -- special case for journal cites
  2658. Title = ''; -- set title to empty string
  2659. add_maint_cat ('untitled');
  2660. end
  2661. check_for_url ({ -- add error message when any of these parameters contains a URL
  2662. ['title']=Title,
  2663. [A:ORIGIN('Chapter')]=Chapter,
  2664. [A:ORIGIN('Periodical')]=Periodical,
  2665. [A:ORIGIN('PublisherName')] = PublisherName,
  2666. });
  2667. -- COinS metadata (see <http://ocoins.info/>) for automated parsing of citation information.
  2668. -- handle the oddity that is cite encyclopedia and {{citation |encyclopedia=something}}. Here we presume that
  2669. -- when Periodical, Title, and Chapter are all set, then Periodical is the book (encyclopedia) title, Title
  2670. -- is the article title, and Chapter is a section within the article. So, we remap
  2671. local coins_chapter = Chapter; -- default assuming that remapping not required
  2672. local coins_title = Title; -- likewise for the title
  2673. if 'encyclopaedia' == config.CitationClass or ('citation' == config.CitationClass and is_set (Encyclopedia)) then
  2674. if is_set (Chapter) and is_set (Title) and is_set (Periodical) then -- if all are used then
  2675. coins_chapter = Title; -- remap
  2676. coins_title = Periodical;
  2677. end
  2678. end
  2679. local coins_author = a; -- default for coins rft.au
  2680. if 0 < #c then -- but if contributor list
  2681. coins_author = c; -- use that instead
  2682. end
  2683. -- this is the function call to COinS()
  2684. local OCinSoutput = COinS({
  2685. ['Periodical'] = Periodical,
  2686. ['Encyclopedia'] = Encyclopedia,
  2687. ['Chapter'] = make_coins_title (coins_chapter, ScriptChapter), -- Chapter and ScriptChapter stripped of bold / italic wikimarkup
  2688. ['Map'] = Map,
  2689. ['Degree'] = Degree; -- cite thesis only
  2690. ['Title'] = make_coins_title (coins_title, ScriptTitle), -- Title and ScriptTitle stripped of bold / italic wikimarkup
  2691. ['PublicationPlace'] = PublicationPlace,
  2692. ['Date'] = COinS_date.rftdate, -- COinS_date has correctly formatted date if Date is valid;
  2693. ['Season'] = COinS_date.rftssn,
  2694. ['Chron'] = COinS_date.rftchron or (not COinS_date.rftdate and Date) or '', -- chron but if not set and invalid date format use Date; keep this last bit?
  2695. ['Series'] = Series,
  2696. ['Volume'] = Volume,
  2697. ['Issue'] = Issue,
  2698. ['Pages'] = get_coins_pages (first_set ({Sheet, Sheets, Page, Pages, At}, 5)), -- pages stripped of external links
  2699. ['Edition'] = Edition,
  2700. ['PublisherName'] = PublisherName,
  2701. ['URL'] = first_set ({ChapterURL, URL}, 2),
  2702. ['Authors'] = coins_author,
  2703. ['ID_list'] = ID_list,
  2704. ['RawPage'] = this_page.prefixedText,
  2705. }, config.CitationClass);
  2706. -- Account for the oddities that are {{cite arxiv}}, AFTER generation of COinS data.
  2707. if 'arxiv' == config.CitationClass then -- we have set rft.jtitle in COinS to arXiv, now unset so it isn't displayed
  2708. Periodical = ''; -- periodical not allowed in cite arxiv; if article has been published, use cite journal
  2709. end
  2710. -- special case for cite newsgroup. Do this after COinS because we are modifying PublisherName to include some static text
  2711. if 'newsgroup' == config.CitationClass then
  2712. if is_set (PublisherName) then
  2713. PublisherName = substitute (cfg.messages['newsgroup'], external_link( 'news:' .. PublisherName, PublisherName, A:ORIGIN('PublisherName') ));
  2714. end
  2715. end
  2716. -- Now perform various field substitutions.
  2717. -- We also add leading spaces and surrounding markup and punctuation to the
  2718. -- various parts of the citation, but only when they are non-nil.
  2719. local EditorCount; -- used only for choosing (ed.) or (eds.) annotation at end of editor name-list
  2720. do
  2721. local last_first_list;
  2722. local maximum;
  2723. local control = {
  2724. format = NameListFormat, -- empty string or 'vanc'
  2725. maximum = nil, -- as if display-authors or display-editors not set
  2726. lastauthoramp = LastAuthorAmp,
  2727. page_name = this_page.text -- get current page name so that we don't wikilink to it via editorlinkn
  2728. };
  2729. do -- do editor name list first because coauthors can modify control table
  2730. maximum , editor_etal = get_display_authors_editors (A['DisplayEditors'], #e, 'editors', editor_etal);
  2731. --[[ Preserve old-style implicit et al.
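  2732. Temporary fix for the "Category:含有旧式缩略标签的引用的页面 in editors" problem: the Chinese version's logic currently differs from the English version's, so this category is not needed for now. Revisit how to handle it at the next update. --2017.6.23 shizhao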
  2733. if not is_set(maximum) and #e == 4 then
  2734. maximum = 3;
  2735. table.insert( z.message_tail, { set_error('implict_etal_editor', {}, true) } );
  2736. end
  2737. ]]
  2738. control.maximum = maximum;
  2739. last_first_list, EditorCount = list_people(control, e, editor_etal, 'editor');
  2740. if is_set (Editors) then
  2741. if editor_etal then
  2742. Editors = Editors .. ' ' .. cfg.messages['et al']; -- add et al. to editors parameter because |display-editors=etal
  2743. EditorCount = 2; -- with et al., |editors= is multiple names; spoof to display (eds.) annotation
  2744. else
  2745. EditorCount = 2; -- we don't know but assume |editors= is multiple names; spoof to display (eds.) annotation
  2746. end
  2747. else
  2748. Editors = last_first_list; -- either an author name list or an empty string
  2749. end
  2750. if 1 == EditorCount and (true == editor_etal or 1 < #e) then -- only one editor displayed but includes etal then
  2751. EditorCount = 2; -- spoof to display (eds.) annotation
  2752. end
  2753. end
  2754. do -- now do translators
  2755. control.maximum = #t; -- number of translators
  2756. Translators = list_people(control, t, false, 'translator'); -- et al not currently supported
  2757. end
  2758. do -- now do contributors
  2759. control.maximum = #c; -- number of contributors
  2760. Contributors = list_people(control, c, false, 'contributor'); -- et al not currently supported
  2761. end
  2762. do -- now do authors
  2763. control.maximum , author_etal = get_display_authors_editors (A['DisplayAuthors'], #a, 'authors', author_etal);
  2764. if is_set(Coauthors) then -- if the coauthor field is also used, prevent ampersand and et al. formatting.
  2765. control.lastauthoramp = nil;
  2766. control.maximum = #a + 1;
  2767. end
  2768. last_first_list = list_people(control, a, author_etal, 'author');
  2769. if is_set (Authors) then
  2770. Authors, author_etal = name_has_etal (Authors, author_etal, false); -- find and remove variations on et al.
  2771. if author_etal then
  2772. Authors = Authors .. ' ' .. cfg.messages['et al']; -- add et al. to authors parameter
  2773. end
  2774. else
  2775. Authors = last_first_list; -- either an author name list or an empty string
  2776. end
  2777. end -- end of do
  2778. if not is_set(Authors) and is_set(Coauthors) then -- coauthors aren't displayed if one of authors=, authorn=, or lastn= isn't specified
  2779. table.insert( z.message_tail, { set_error('coauthors_missing_author', {}, true) } ); -- emit error message
  2780. end
  2781. end
  2782. -- apply |[xx-]format= styling; at the end, these parameters hold correctly styled format annotation,
  2783. -- an error message if the associated url is not set, or an empty string for concatenation
  2784. ArchiveFormat = style_format (ArchiveFormat, ArchiveURL, 'archive-format', 'archive-url');
  2785. ConferenceFormat = style_format (ConferenceFormat, ConferenceURL, 'conference-format', 'conference-url');
  2786. Format = style_format (Format, URL, 'format', 'url');
  2787. LayFormat = style_format (LayFormat, LayURL, 'lay-format', 'lay-url');
  2788. TranscriptFormat = style_format (TranscriptFormat, TranscriptURL, 'transcript-format', 'transcripturl');
  2789. -- special case for chapter format so no error message or cat when chapter not supported
  2790. if not (in_array(config.CitationClass, {'web','news','journal', 'magazine', 'pressrelease','podcast', 'newsgroup', 'arxiv'}) or
  2791. ('citation' == config.CitationClass and is_set (Periodical) and not is_set (Encyclopedia))) then
  2792. ChapterFormat = style_format (ChapterFormat, ChapterURL, 'chapter-format', 'chapter-url');
  2793. end
  2794. if not is_set(URL) then
  2795. if in_array(config.CitationClass, {"web","podcast", "mailinglist"}) then -- Test if cite web, cite podcast, or cite mailing list |url= is missing or empty
  2796. table.insert( z.message_tail, { set_error( 'cite_web_url', {}, true ) } );
  2797. end
  2798. -- Test if accessdate is given without giving a URL
  2799. if is_set(AccessDate) and not is_set(ChapterURL) then -- ChapterURL may be set when the others are not set; TODO: move this to a separate test?
  2800. table.insert( z.message_tail, { set_error( 'accessdate_missing_url', {}, true ) } );
  2801. AccessDate = '';
  2802. end
  2803. end
  2804. local OriginalURL, OriginalURLorigin, OriginalFormat; -- TODO: swap chapter and title here so that archive applies to most specific if both are set?
  2805. DeadURL = DeadURL:lower(); -- used later when assembling archived text
  2806. if is_set( ArchiveURL ) then
  2807. if is_set (URL) then
  2808. OriginalURL = URL; -- save copy of original source URL
  2809. OriginalURLorigin = URLorigin; -- name of url parameter for error messages
  2810. OriginalFormat = Format; -- and original |format=
  2811. if 'no' ~= DeadURL then -- if URL set then archive-url applies to it
  2812. URL = ArchiveURL -- swap-in the archive's url
  2813. URLorigin = A:ORIGIN('ArchiveURL') -- name of archive url parameter for error messages
  2814. Format = ArchiveFormat or ''; -- swap in archive's format
  2815. end
  2816. elseif is_set (ChapterURL) then -- URL not set so if chapter-url is set apply archive url to it
  2817. OriginalURL = ChapterURL; -- save copy of source chapter's url for archive text
  2818. OriginalURLorigin = ChapterURLorigin; -- name of chapter-url parameter for error messages
  2819. OriginalFormat = ChapterFormat; -- and original |chapter-format=
  2820. if 'no' ~= DeadURL then
  2821. ChapterURL = ArchiveURL -- swap-in the archive's url
  2822. ChapterURLorigin = A:ORIGIN('ArchiveURL') -- name of archive-url parameter for error messages
  2823. ChapterFormat = ArchiveFormat or ''; -- swap in archive's format
  2824. end
  2825. end
  2826. end
  2827. if in_array(config.CitationClass, {'web','news','journal', 'magazine', 'pressrelease','podcast', 'newsgroup', 'arxiv'}) or -- if any of the 'periodical' cites except encyclopedia
  2828. ('citation' == config.CitationClass and is_set (Periodical) and not is_set (Encyclopedia)) then
  2829. local chap_param;
  2830. if is_set (Chapter) then -- get a parameter name from one of these chapter related meta-parameters
  2831. chap_param = A:ORIGIN ('Chapter')
  2832. elseif is_set (TransChapter) then
  2833. chap_param = A:ORIGIN ('TransChapter')
  2834. elseif is_set (ChapterURL) then
  2835. chap_param = A:ORIGIN ('ChapterURL')
  2836. elseif is_set (ScriptChapter) then
  2837. chap_param = A:ORIGIN ('ScriptChapter')
  2838. elseif is_set (ChapterFormat) then
  2839. chap_param = A:ORIGIN ('ChapterFormat')
  2840. end
  2841. if is_set (chap_param) then -- if we found one
  2842. table.insert( z.message_tail, { set_error( 'chapter_ignored', {chap_param}, true ) } ); -- add error message
  2843. Chapter = ''; -- and set them to empty string to be safe with concatenation
  2844. TransChapter = '';
  2845. ChapterURL = '';
  2846. ScriptChapter = '';
  2847. ChapterFormat = '';
  2848. end
  2849. else -- otherwise, format chapter / article title
  2850. local no_quotes = false; -- default assume that we will be quoting the chapter parameter value
  2851. if is_set (Contribution) and 0 < #c then -- if this is a contribution with contributor(s)
  2852. if in_array (Contribution:lower(), cfg.keywords.contribution) then -- and a generic contribution title
  2853. no_quotes = true; -- then render it unquoted
  2854. end
  2855. end
  2856. Chapter = format_chapter_title (ScriptChapter, Chapter, TransChapter, ChapterURL, ChapterURLorigin, no_quotes); -- Contribution is also in Chapter
  2857. if is_set (Chapter) then
  2858. if 'map' == config.CitationClass and is_set (TitleType) then
  2859. Chapter = Chapter .. ' ' .. TitleType;
  2860. end
  2861. Chapter = Chapter .. ChapterFormat .. sepc .. ' ';
  2862. elseif is_set (ChapterFormat) then -- |chapter= not set but |chapter-format= is so ...
  2863. Chapter = ChapterFormat .. sepc .. ' '; -- ... ChapterFormat has error message, we want to see it
  2864. end
  2865. end
  2866. -- Format main title.
  2867. if is_set(TitleLink) and is_set(Title) then
  2868. Title = "[[" .. TitleLink .. "|" .. Title .. "]]"
  2869. end
  2870. if in_array(config.CitationClass, {'web','news','journal', 'magazine', 'pressrelease','podcast', 'newsgroup', 'mailinglist', 'arxiv'}) or
  2871. ('citation' == config.CitationClass and is_set (Periodical) and not is_set (Encyclopedia)) or
  2872. ('map' == config.CitationClass and is_set (Periodical)) then -- special case for cite map when the map is in a periodical treat as an article
  2873. Title = kern_quotes (Title); -- if necessary, separate title's leading and trailing quote marks from Module provided quote marks
  2874. Title = wrap_style ('quoted-title', Title);
  2875. Title = script_concatenate (Title, ScriptTitle); -- <bdi> tags, lang attribute, categorization, etc; must be done after title is wrapped
  2876. TransTitle= wrap_style ('trans-quoted-title', TransTitle );
  2877. elseif 'report' == config.CitationClass then -- no styling for cite report
  2878. Title = script_concatenate (Title, ScriptTitle); -- <bdi> tags, lang attribute, categorization, etc; must be done after title is wrapped
  2879. TransTitle= wrap_style ('trans-quoted-title', TransTitle ); -- for cite report, use this form for trans-title
  2880. else
  2881. Title = wrap_style ('italic-title', Title);
  2882. Title = script_concatenate (Title, ScriptTitle); -- <bdi> tags, lang attribute, categorization, etc; must be done after title is wrapped
  2883. TransTitle = wrap_style ('trans-italic-title', TransTitle);
  2884. end
  2885. TransError = "";
  2886. if is_set(TransTitle) then
  2887. if is_set(Title) then
  2888. TransTitle = " " .. TransTitle;
  2889. else
  2890. TransError = " " .. set_error( 'trans_missing_title', {'title'} );
  2891. end
  2892. end
  2893. Title = Title .. TransTitle;
  2894. if is_set(Title) then
  2895. if not is_set(TitleLink) and is_set(URL) then
  2896. Title = external_link( URL, Title, URLorigin ) .. TransError .. Format;
  2897. URL = "";
  2898. Format = "";
  2899. else
  2900. Title = Title .. TransError;
  2901. end
  2902. end
  2903. if is_set(Place) then
  2904. Place = " " .. wrap_msg ('written', Place, use_lowercase) .. sepc .. " ";
  2905. end
  2906. if is_set (Conference) then
  2907. if is_set (ConferenceURL) then
  2908. Conference = external_link( ConferenceURL, Conference, ConferenceURLorigin );
  2909. end
  2910. Conference = sepc .. " " .. Conference .. ConferenceFormat;
  2911. elseif is_set(ConferenceURL) then
  2912. Conference = sepc .. " " .. external_link( ConferenceURL, nil, ConferenceURLorigin );
  2913. end
  2914. if not is_set(Position) then
  2915. local Minutes = A['Minutes'];
  2916. local Time = A['Time'];
  2917. if is_set(Minutes) then
  2918. if is_set (Time) then
  2919. table.insert( z.message_tail, { set_error( 'redundant_parameters', {wrap_style ('parameter', 'minutes') .. ' and ' .. wrap_style ('parameter', 'time')}, true ) } );
  2920. end
  2921. Position = " " .. Minutes .. " " .. cfg.messages['minutes'];
  2922. else
  2923. if is_set(Time) then
  2924. local TimeCaption = A['TimeCaption']
  2925. if not is_set(TimeCaption) then
  2926. TimeCaption = cfg.messages['event'];
  2927. if sepc ~= '.' then
  2928. TimeCaption = TimeCaption:lower();
  2929. end
  2930. end
  2931. Position = " " .. TimeCaption .. " " .. Time;
  2932. end
  2933. end
  2934. else
  2935. Position = " " .. Position;
  2936. At = '';
  2937. end
  2938. Page, Pages, Sheet, Sheets = format_pages_sheets (Page, Pages, Sheet, Sheets, config.CitationClass, Periodical_origin, sepc, NoPP, use_lowercase);
  2939. At = is_set(At) and (sepc .. " " .. At) or "";
  2940. Position = is_set(Position) and (sepc .. " " .. Position) or "";
  2941. if config.CitationClass == 'map' then
  2942. local Section = A['Section'];
  2943. local Sections = A['Sections'];
  2944. local Inset = A['Inset'];
  2945. if is_set( Inset ) then
  2946. Inset = sepc .. " " .. wrap_msg ('inset', Inset, use_lowercase);
  2947. end
  2948. if is_set( Sections ) then
  2949. Section = sepc .. " " .. wrap_msg ('sections', Sections, use_lowercase);
  2950. elseif is_set( Section ) then
  2951. Section = sepc .. " " .. wrap_msg ('section', Section, use_lowercase);
  2952. end
  2953. At = At .. Inset .. Section;
  2954. end
  2955. if is_set (Language) then
  2956. Language = language_parameter (Language); -- format, categories, name from ISO639-1, etc
  2957. else
  2958. Language=""; -- language not specified so make sure this is an empty string;
  2959. end
  2960. Others = is_set(Others) and (sepc .. " " .. Others) or "";
  2961. if is_set (Translators) then
  2962. Others = sepc .. ' 由' .. Translators .. '翻译 ' .. Others;
  2963. end
  2964. TitleNote = is_set(TitleNote) and (sepc .. " " .. TitleNote) or "";
  2965. if is_set (Edition) then
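  2966. if Edition:match ('%f[%a][Ee]d%.?$') or Edition:match ('%f[%a][Ee]dition$') then -- value ends with a standalone 'ed', 'ed.', or 'edition' (initial letter in either case); flag it as extra text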
  2967. add_maint_cat ('extra_text', 'edition');
  2968. end
  2969. Edition = " " .. wrap_msg ('edition', Edition);
  2970. else
  2971. Edition = '';
  2972. end
  2973. Series = is_set(Series) and (sepc .. " " .. Series) or "";
  2974. OrigYear = is_set(OrigYear) and (" [" .. OrigYear .. "]") or "";
  2975. Agency = is_set(Agency) and (sepc .. " " .. Agency) or "";
  2976. Volume = format_volume_issue (Volume, Issue, config.CitationClass, Periodical_origin, sepc, use_lowercase);
  2977. ------------------------------------ totally unrelated data
  2978. if is_set(Via) then
  2979. Via = " " .. wrap_msg ('via', Via);
  2980. end
  2981. --[[
  2982. Subscription implies paywall; Registration does not. If both are used in a citation, only the subscription-required
  2983. note is displayed. There are no error messages for this condition.
  2984. ]]
  2985. if is_set (SubscriptionRequired) then
  2986. SubscriptionRequired = sepc .. " " .. cfg.messages['subscription']; -- subscription required message
  2987. elseif is_set (RegistrationRequired) then
  2988. SubscriptionRequired = sepc .. " " .. cfg.messages['registration']; -- registration required message
  2989. else
  2990. SubscriptionRequired = ''; -- either or both might be set to something other than yes true y
  2991. end
  2992. if is_set(AccessDate) then
  2993. local retrv_text = " " .. cfg.messages['retrieved']
  2994. AccessDate = nowrap_date (AccessDate); -- wrap in nowrap span if date in appropriate format
  2995. if (sepc ~= ".") then retrv_text = retrv_text:lower() end -- if 'citation', lower case
  2996. AccessDate = substitute (retrv_text, AccessDate); -- add retrieved text
  2997. -- neither of these work; don't know why; it seems that substitute() isn't being called
  2998. AccessDate = substitute (cfg.presentation['accessdate'], {sepc, AccessDate}); -- allow editors to hide accessdates
  2999. end
  3000. if is_set(ID) then ID = sepc .." ".. ID; end
  3001. if "thesis" == config.CitationClass and is_set(Docket) then
  3002. ID = sepc .." Docket ".. Docket .. ID;
  3003. end
  3004. if "report" == config.CitationClass and is_set(Docket) then -- for cite report when |docket= is set
  3005. ID = sepc .. ' ' .. Docket; -- overwrite ID even if |id= is set
  3006. end
  3007. ID_list = build_id_list( ID_list, {DoiBroken = DoiBroken, ASINTLD = ASINTLD, IgnoreISBN = IgnoreISBN, Embargo=Embargo, Class = Class} );
  3008. if is_set(URL) then
  3009. URL = " " .. external_link( URL, nil, URLorigin );
  3010. end
  3011. if is_set(Quote) then
  3012. if Quote:sub(1,1) == '"' and Quote:sub(-1,-1) == '"' then -- if first and last characters of quote are quote marks
  3013. Quote = Quote:sub(2,-2); -- strip them off
  3014. end
  3015. Quote = sepc .." " .. wrap_style ('quoted-text', Quote ); -- wrap in <q>...</q> tags
  3016. PostScript = ""; -- cs1|2 does not supply terminal punctuation when |quote= is set
  3017. end
  3018. local Archived
  3019. if is_set(ArchiveURL) then
  3020. if not is_set(ArchiveDate) then
  3021. ArchiveDate = set_error('archive_missing_date');
  3022. end
  3023. if "no" == DeadURL then
  3024. local arch_text = cfg.messages['archived'];
  3025. if sepc ~= "." then arch_text = arch_text:lower() end
  3026. Archived = sepc .. " " .. substitute( cfg.messages['archived-not-dead'],
  3027. { external_link( ArchiveURL, arch_text, A:ORIGIN('ArchiveURL') ) .. ArchiveFormat, ArchiveDate } );
  3028. if not is_set(OriginalURL) then
  3029. Archived = Archived .. " " .. set_error('archive_missing_url');
  3030. end
  3031. elseif is_set(OriginalURL) then -- DeadURL is empty, 'yes', 'true', 'y', 'unfit', 'usurped'
  3032. local arch_text = cfg.messages['archived-dead'];
  3033. if sepc ~= "." then arch_text = arch_text:lower() end
  3034. if in_array (DeadURL, {'unfit', 'usurped'}) then
  3035. Archived = sepc .. " " .. 'Archived from the original on ' .. ArchiveDate; -- format already styled
  3036. else -- DeadURL is empty, 'yes', 'true', or 'y'
  3037. Archived = sepc .. " " .. substitute( arch_text,
  3038. { external_link( OriginalURL, cfg.messages['original'], OriginalURLorigin ) .. OriginalFormat, ArchiveDate } ); -- format already styled
  3039. end
  3040. else
  3041. local arch_text = cfg.messages['archived-missing'];
  3042. if sepc ~= "." then arch_text = arch_text:lower() end
  3043. Archived = sepc .. " " .. substitute( arch_text,
  3044. { set_error('archive_missing_url'), ArchiveDate } );
  3045. end
  3046. elseif is_set (ArchiveFormat) then
  3047. Archived = ArchiveFormat; -- if set and ArchiveURL not set ArchiveFormat has error message
  3048. else
  3049. Archived = ""
  3050. end
  3051. local Lay = '';
  3052. if is_set(LayURL) then
  3053. if is_set(LayDate) then LayDate = " (" .. LayDate .. ")" end
  3054. if is_set(LaySource) then
  3055. LaySource = " &ndash; ''" .. safe_for_italics(LaySource) .. "''";
  3056. else
  3057. LaySource = "";
  3058. end
  3059. if sepc == '.' then
  3060. Lay = sepc .. " " .. external_link( LayURL, cfg.messages['lay summary'], A:ORIGIN('LayURL') ) .. LayFormat .. LaySource .. LayDate
  3061. else
  3062. Lay = sepc .. " " .. external_link( LayURL, cfg.messages['lay summary']:lower(), A:ORIGIN('LayURL') ) .. LayFormat .. LaySource .. LayDate
  3063. end
  3064. elseif is_set (LayFormat) then -- Test if |lay-format= is given without giving a |lay-url=
  3065. Lay = sepc .. LayFormat; -- if set and LayURL not set, then LayFormat has error message
  3066. end
  3067. if is_set(Transcript) then
  3068. if is_set(TranscriptURL) then
  3069. Transcript = external_link( TranscriptURL, Transcript, TranscriptURLorigin );
  3070. end
  3071. Transcript = sepc .. ' ' .. Transcript .. TranscriptFormat;
  3072. elseif is_set(TranscriptURL) then
  3073. Transcript = external_link( TranscriptURL, nil, TranscriptURLorigin );
  3074. end
  3075. local Publisher;
  3076. if is_set(Periodical) and
  3077. not in_array(config.CitationClass, {"encyclopaedia","web","pressrelease","podcast"}) then
  3078. if is_set(PublisherName) then
  3079. if is_set(PublicationPlace) then
  3080. Publisher = PublicationPlace .. ": " .. PublisherName;
  3081. else
  3082. Publisher = PublisherName;
  3083. end
  3084. elseif is_set(PublicationPlace) then
  3085. Publisher= PublicationPlace;
  3086. else
  3087. Publisher = "";
  3088. end
  3089. if is_set(Publisher) then
  3090. Publisher = " (" .. Publisher .. ")";
  3091. end
  3092. else
  3093. if is_set(PublisherName) then
  3094. if is_set(PublicationPlace) then
  3095. Publisher = sepc .. " " .. PublicationPlace .. ": " .. PublisherName;
  3096. else
  3097. Publisher = sepc .. " " .. PublisherName;
  3098. end
  3099. elseif is_set(PublicationPlace) then
  3100. Publisher= sepc .. " " .. PublicationPlace;
  3101. else
  3102. Publisher = '';
  3103. end
  3104. end
  3105. -- Several of the above rely upon detecting this as nil, so do it last.
  3106. if is_set(Periodical) then
  3107. if is_set(Title) or is_set(TitleNote) then
  3108. Periodical = sepc .. " " .. wrap_style ('italic-title', Periodical)
  3109. else
  3110. Periodical = wrap_style ('italic-title', Periodical)
  3111. end
  3112. end
  3113. --[[
  3114. Handle the oddity that is cite speech. This code overrides whatever may be the value assigned to TitleNote (through |department=) and forces it to be " (Speech)" so that
  3115. the annotation directly follows the |title= parameter value in the citation rather than the |event= parameter value (if provided).
  3116. ]]
  3117. if "speech" == config.CitationClass then -- cite speech only
  3118. TitleNote = " (Speech)"; -- annotate the citation
  3119. if is_set (Periodical) then -- if Periodical, perhaps because of an included |website= or |journal= parameter
  3120. if is_set (Conference) then -- and if |event= is set
  3121. Conference = Conference .. sepc .. " "; -- then add appropriate punctuation to the end of the Conference variable before rendering
  3122. end
  3123. end
  3124. end
  3125. -- Piece all bits together at last. Here, all should be non-nil.
  3126. -- We build things this way because it is more efficient in Lua
  3127. -- not to keep reassigning to the same string variable over and over.
  3128. local tcommon;
  3129. local tcommon2; -- used for book cite when |contributor= is set
  3130. if in_array(config.CitationClass, {"journal","citation"}) and is_set(Periodical) then
  3131. if is_set(Others) then Others = Others .. sepc .. " " end
  3132. tcommon = safe_join( {Others, Title, TitleNote, Conference, Periodical, Format, TitleType, Series,
  3133. Edition, Publisher, Agency}, sepc );
  3134. elseif in_array(config.CitationClass, {"book","citation"}) and not is_set(Periodical) then -- special cases for book cites
  3135. if is_set (Contributors) then -- when we are citing foreword, preface, introduction, etc
  3136. tcommon = safe_join( {Title, TitleNote}, sepc ); -- author and other stuff will come after this and before tcommon2
  3137. tcommon2 = safe_join( {Conference, Periodical, Format, TitleType, Series, Volume, Others, Edition, Publisher, Agency}, sepc );
  3138. else
  3139. tcommon = safe_join( {Title, TitleNote, Conference, Periodical, Format, TitleType, Series, Volume, Others, Edition, Publisher, Agency}, sepc );
  3140. end
  3141. elseif 'map' == config.CitationClass then -- special cases for cite map
  3142. if is_set (Chapter) then -- map in a book; TitleType is part of Chapter
  3143. tcommon = safe_join( {Title, Format, Edition, Scale, Series, Cartography, Others, Publisher, Volume}, sepc );
  3144. elseif is_set (Periodical) then -- map in a periodical
  3145. tcommon = safe_join( {Title, TitleType, Format, Periodical, Scale, Series, Cartography, Others, Publisher, Volume}, sepc );
  3146. else -- a sheet or stand-alone map
  3147. tcommon = safe_join( {Title, TitleType, Format, Edition, Scale, Series, Cartography, Others, Publisher}, sepc );
  3148. end
  3149. elseif 'episode' == config.CitationClass then -- special case for cite episode
  3150. tcommon = safe_join( {Title, TitleNote, TitleType, Series, Transcript, Edition, Publisher}, sepc );
  3151. else -- all other CS1 templates
  3152. tcommon = safe_join( {Title, TitleNote, Conference, Periodical, Format, TitleType, Series,
  3153. Volume, Others, Edition, Publisher, Agency}, sepc );
  3154. end
  3155. if #ID_list > 0 then
  3156. ID_list = safe_join( { sepc .. " ", table.concat( ID_list, sepc .. " " ), ID }, sepc );
  3157. else
  3158. ID_list = ID;
  3159. end
  3160. -- LOCAL
  3161. local xDate = Date
  3162. local pgtext = Position .. Sheet .. Sheets .. Page .. Pages .. At;
  3163. if ( is_set(Periodical) and Date ~= '' and
  3164. not in_array(config.CitationClass, {"encyclopaedia","web"}) )
  3165. or ( in_array(config.CitationClass, {"book","news"}) ) then
  3166. if in_array(config.CitationClass, {"journal","citation"}) and ( Volume ~= '' or Issue ~= '' ) then
  3167. xDate = xDate .. ',' .. Volume
  3168. end
  3169. xDate = xDate .. pgtext
  3170. pgtext = ''
  3171. end
  3172. if PublicationDate and PublicationDate ~= '' then
  3173. xDate = xDate .. ' (' .. PublicationDate .. ')'
  3174. end
  3175. if OrigYear ~= '' then
  3176. xDate = xDate .. OrigYear
  3177. end
  3178. if AccessDate ~= '' then
  3179. xDate = xDate .. ' ' .. AccessDate
  3180. end
  3181. if xDate ~= '' then
  3182. xDate = sepc .. ' ' .. xDate
  3183. end
  3184. -- END LOCAL
	local idcommon = safe_join( { URL, xDate, ID_list, Archived, Via, SubscriptionRequired, Lay, Language, Quote }, sepc );
	local text;

	if is_set(Authors) then
		if is_set(Coauthors) then
			if 'vanc' == NameListFormat then					-- separate authors and coauthors with proper name-list-separator
				Authors = Authors .. ', ' .. Coauthors;
			else
				Authors = Authors .. '; ' .. Coauthors;
			end
		end
		Authors = terminate_name_list (Authors, sepc);			-- when no date, terminate with 0 or 1 sepc and a space
		if is_set(Editors) then
			local in_text = " ";
			local post_text = "";
			if is_set(Chapter) and 0 == #c then
				in_text = in_text .. cfg.messages['in'] .. " "
				if (sepc ~= '.') then in_text = in_text:lower() end	-- lowercase for cs2
			else
				if EditorCount <= 1 then
					post_text = ", " .. cfg.messages['editor'];
				else
					post_text = ", " .. cfg.messages['editors'];
				end
			end
			Editors = terminate_name_list (in_text .. Editors .. post_text, sepc);	-- terminate with 0 or 1 sepc and a space
		end
		if is_set (Contributors) then							-- book cite and we're citing the intro, preface, etc
			local by_text = sepc .. ' ' .. cfg.messages['by'] .. ' ';
			if (sepc ~= '.') then by_text = by_text:lower() end	-- lowercase for cs2
			Authors = by_text .. Authors;						-- author follows title so tweak it here
			if is_set (Editors) then							-- when Editors make sure that Authors gets terminated
				Authors = terminate_name_list (Authors, sepc);	-- terminate with 0 or 1 sepc and a space
			end
			Contributors = terminate_name_list (Contributors, sepc);	-- terminate with 0 or 1 sepc and a space
			text = safe_join( {Contributors, Chapter, tcommon, Authors, Place, Editors, tcommon2, pgtext, idcommon }, sepc );
		else
			text = safe_join( {Authors, Chapter, Place, Editors, tcommon, pgtext, idcommon }, sepc );
		end
	elseif is_set(Editors) then
		if EditorCount <= 1 then
			Editors = Editors .. " (" .. cfg.messages['editor'] .. ")" .. sepc .. " "
		else
			Editors = Editors .. " (" .. cfg.messages['editors'] .. ")" .. sepc .. " "
		end
		text = safe_join( {Editors, Chapter, Place, tcommon, pgtext, idcommon}, sepc );
	else
		text = safe_join( {Chapter, Place, tcommon, pgtext, idcommon}, sepc );	-- journal cites with a periodical used this same join, so the two branches are merged
	end

	if is_set(PostScript) and PostScript ~= sepc then
		text = safe_join( {text, sepc}, sepc );					-- deals with italics, spaces, etc.
		text = text:sub(1,-sepc:len()-1);
	end

	text = safe_join( {text, PostScript}, sepc );

	-- Now enclose the whole thing in a <cite/> element
	local options = {};

	if is_set(config.CitationClass) and config.CitationClass ~= "citation" then
		options.class = "citation " .. config.CitationClass;	-- class=citation required for blue highlight when used with |ref=
	else
		options.class = "citation";
	end

	if is_set(Ref) and Ref:lower() ~= "none" then				-- set reference anchor if appropriate
		local id = Ref
		if ('harv' == Ref ) then
			local namelist = {};								-- holds selected contributor, author, editor name list
			local year = first_set ({Year, anchor_year}, 2);	-- Year first for legacy citations and for YMD dates that require disambiguation

			if #c > 0 then										-- if there is a contributor list
				namelist = c;									-- select it
			elseif #a > 0 then									-- or an author list
				namelist = a;
			elseif #e > 0 then									-- or an editor list
				namelist = e;
			end
			id = anchor_id (namelist, year);					-- go make the CITEREF anchor
		end
		options.id = id;
	end

	-- strip <span> elements and any remaining tags; if two or fewer characters are left, the citation is empty
	if string.len(text:gsub("<span[^>/]*>.-</span>", ""):gsub("%b<>","")) <= 2 then
		z.error_categories = {};
		text = set_error('empty_citation');
		z.message_tail = {};
	end

	if is_set(options.id) then
		text = '<cite id="' .. mw.uri.anchorEncode(options.id) ..'" class="' .. mw.text.nowiki(options.class) .. '">' .. text .. "</cite>";
	else
		text = '<cite class="' .. mw.text.nowiki(options.class) .. '">' .. text .. "</cite>";
	end

	local empty_span = '<span style="display:none;">&nbsp;</span>';

	-- Note: Using display: none on the COinS span breaks some clients.
	local OCinS = '<span title="' .. OCinSoutput .. '" class="Z3988">' .. empty_span .. '</span>';
	text = text .. OCinS;

	if #z.message_tail ~= 0 then
		text = text .. " ";
		for i,v in ipairs( z.message_tail ) do
			if is_set(v[1]) then
				if i == #z.message_tail then
					text = text .. error_comment( v[1], v[2] );
				else
					text = text .. error_comment( v[1] .. "; ", v[2] );
				end
			end
		end
	end

	if #z.maintenance_cats ~= 0 then
		text = text .. '<span class="citation-comment" style="display:none; color:#33aa33">';
		for _, v in ipairs( z.maintenance_cats ) do			-- append maintenance categories
			text = text .. ' ' .. v .. ' ([[:Category:' .. v ..'|link]])';
		end
		text = text .. '</span>';								-- maintenance messages (really just the names of the categories for now)
	end

	no_tracking_cats = no_tracking_cats:lower();
	if in_array(no_tracking_cats, {"", "no", "false", "n"}) then
		for _, v in ipairs( z.error_categories ) do
			text = text .. '[[Category:' .. v ..']]';
		end
		for _, v in ipairs( z.maintenance_cats ) do			-- append maintenance categories
			text = text .. '[[Category:' .. v ..']]';
		end
	end

	return text
end

--[[--------------------------< H A S _ I N V I S I B L E _ C H A R S >----------------------------------------

This function searches a parameter's value for nonprintable or invisible characters. The search stops at the first match.

Sometime after this module is done with rendering a citation, some C0 control characters are replaced with the
replacement character. That replacement character is not detected by this test though it is visible to readers
of the rendered citation. This function will detect the replacement character when it is part of the wikisource.

Output of this function is an error message that identifies the character or the Unicode group that the character
belongs to along with its position in the parameter value.

]]
--[[
local function has_invisible_chars (param, v)
	local position = '';
	local i=1;

	while cfg.invisible_chars[i] do
		local char=cfg.invisible_chars[i][1]					-- the character or group name
		local pattern=cfg.invisible_chars[i][2]					-- the pattern used to find it
		v = mw.text.unstripNoWiki( v );							-- remove nowiki stripmarkers
		position = mw.ustring.find (v, pattern)					-- see if the parameter value contains characters that match the pattern
		if position then
			table.insert( z.message_tail, { set_error( 'invisible_char', {char, wrap_style ('parameter', param), position}, true ) } );	-- add error message
			return;												-- and done with this parameter
		end
		i=i+1;													-- bump our index
	end
end
]]

--[[--------------------------< Z . C I T A T I O N >----------------------------------------------------------

This is used by templates such as {{cite book}} to create the actual citation text.

]]

function z.citation(frame)
	local pframe = frame:getParent()
	local validation;

	if nil ~= string.find (frame:getTitle(), 'sandbox', 1, true) then	-- did the {{#invoke:}} use sandbox version?
		cfg = mw.loadData ('Module:Citation/CS1/Configuration/sandbox');	-- load sandbox versions of Configuration and Whitelist and ...
		whitelist = mw.loadData ('Module:Citation/CS1/Whitelist/sandbox');
		validation = require ('Module:Citation/CS1/Date_validation/sandbox');	-- ... sandbox version of date validation code
	else															-- otherwise
		cfg = mw.loadData ('Module:Citation/CS1/Configuration');	-- load live versions of Configuration and Whitelist and ...
		whitelist = mw.loadData ('Module:Citation/CS1/Whitelist');
		validation = require ('Module:Citation/CS1/Date_validation');	-- ... live version of date validation code
	end

	dates = validation.dates;										-- imported functions
	year_date_check = validation.year_date_check;

	local args = {};
	local suggestions = {};
	local error_text, error_state;

	local config = {};
	for k, v in pairs( frame.args ) do
		config[k] = v;
		args[k] = v;
	end

	local capture;													-- the single supported capture when matching unknown parameters using patterns
	for k, v in pairs( pframe.args ) do
		if v ~= '' then
			if not validate( k ) then
				error_text = "";
				if type( k ) ~= 'string' then
					-- Exclude empty numbered parameters
					if v:match("%S+") ~= nil then
						error_text, error_state = set_error( 'text_ignored', {v}, true );
					end
				elseif validate( k:lower() ) then
					error_text, error_state = set_error( 'parameter_ignored_suggest', {k, k:lower()}, true );
				else
					if nil == suggestions.suggestions then			-- if this table is nil then we need to load it
						if nil ~= string.find (frame:getTitle(), 'sandbox', 1, true) then	-- did the {{#invoke:}} use sandbox version?
							suggestions = mw.loadData( 'Module:Citation/CS1/Suggestions/sandbox' );	-- use the sandbox version
						else
							suggestions = mw.loadData( 'Module:Citation/CS1/Suggestions' );	-- use the live version
						end
					end
					for pattern, param in pairs (suggestions.patterns) do	-- loop through the patterns to see if we can suggest a proper parameter
						capture = k:match (pattern);				-- the whole match if no capture in pattern, else the capture if a match
						if capture then								-- if the pattern matches
							param = substitute( param, capture );	-- add the capture to the suggested parameter (typically the enumerator)
							error_text, error_state = set_error( 'parameter_ignored_suggest', {k, param}, true );	-- set the error message
						end
					end
					if not is_set (error_text) then					-- couldn't match with a pattern, is there an explicit suggestion?
						if suggestions.suggestions[ k:lower() ] ~= nil then
							error_text, error_state = set_error( 'parameter_ignored_suggest', {k, suggestions.suggestions[ k:lower() ]}, true );
						else
							error_text, error_state = set_error( 'parameter_ignored', {k}, true );
						end
					end
				end
				if error_text ~= '' then
					table.insert( z.message_tail, {error_text, error_state} );
				end
			end
			args[k] = v;
		elseif args[k] ~= nil or (k == 'postscript') then
			args[k] = v;
		end
	end

	for k, v in pairs( args ) do
		if 'string' == type (k) then								-- don't evaluate positional parameters
			has_invisible_chars (k, v);
		end
	end
	return citation0( config, args)
end

return z