Wikipedia:Lua requests
Lua scripts on Wikipedia are similar to templates but useful for performing more complex tasks for which templates are too complex or slow. Common examples include numeric computations, string manipulation and parsing, and decision trees. You can use this page to request help from Lua developers in writing a Lua module for a specific task on Wikipedia or another Wikimedia Foundation project. Both debugging help and full implementation are available.
To start a request, just make a new section below and describe what you need. You may wish to first check Special:PrefixIndex/Module: to see if you can find a suitable existing script.
It may help developers to provide examples of where the task is likely to be useful within Wikipedia. If the proposal would replace or improve upon existing templates, please note which ones.
External links
At the small Wikipedias it's hard to find Lua experts so I hope you don't mind if I ask here. I would like to correct the external links (Identificants) in this article:
The article is based on:
At Lua template no. 2 there must be an error but I can't find it. --Marc Alioria (talk) 01:56, 7 March 2014 (UTC)
- At en.wiki, "external link" means something like http://example.com in an "External links" section at the end of some articles. I think you are referring to the "Q851164" that appears in the article, which should be some text extracted from Wikidata. As far as I can tell, oc:Modèl:Wikidata is a template which invokes oc:Module:Wikidata, and the only other module involved on the article linked above is oc:Module:Wikidata/formatatge. Is "Escolaritat Q851164" the problem you mean? Are there any pages with the Wikidata template which work? The #2 item above is a template which does not directly use Lua, although it might be calling the Wikidata module via its template. Johnuniq (talk) 02:46, 7 March 2014 (UTC)
- With external links I've meant the section "Identificants": VIAF http://13.219.148.195/. The correct link would be: https://viaf.org/viaf/232494275/. --Marc Alioria (talk) 03:00, 7 March 2014 (UTC)
- I've found what I think is the problem: this edit removed the {{{3}}} parameter from oc:Modèl:Linha Wikidata extèrne, which is being used by oc:Modèl:Infobox identificacions autoritats. I'll double-check that that's the only thing broken, and fix it if everything looks ok. — Mr. Stradivarius ♪ talk ♪ 03:25, 7 March 2014 (UTC)
- Yep, that looks like the problem - after expanding the templates, those external links show up as wikitext like [http://1020032421 1020032421], which is being interpreted as an IP address by Chrome and probably other browsers too. If you add the missing URL portion, it turns into a valid link about the article's subject: 1020032421. — Mr. Stradivarius ♪ talk ♪ 03:38, 7 March 2014 (UTC)
- Good! I was getting lost. Johnuniq (talk) 03:45, 7 March 2014 (UTC)
- And now fixed here. In the end, it didn't have anything to do with Lua. :) — Mr. Stradivarius ♪ talk ♪ 03:50, 7 March 2014 (UTC)
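The failure described in this thread was in template wikitext rather than in a module, but the same mistake is easy to reproduce in Lua. The sketch below is only an illustration: `externalLink` and its parameters are hypothetical names, and the VIAF prefix is taken from the corrected link quoted above.

```lua
-- Hypothetical helper: builds external-link wikitext from a URL prefix and
-- an identifier. It is not part of any oc.wiki template or module.
local function externalLink(prefix, id)
    return '[' .. prefix .. id .. ' ' .. id .. ']'
end

-- With only the scheme as prefix, the identifier itself becomes the "host",
-- which browsers may interpret as an IP address:
local broken = externalLink('http://', '1020032421')

-- With the full site prefix, the link points at the intended record:
local fixed = externalLink('https://viaf.org/viaf/', '232494275')
```

Here `broken` reproduces the wikitext reported in the thread, while `fixed` carries the complete URL.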
Function needed
I need a small function isList() that returns true if a string is a wikilist, simply by checking if the first character is "*", "#", ";" or ":". — Edokter (talk) — 13:16, 7 March 2014 (UTC)
- i believe something like
{{#invoke:String|find|source = <YOUR STRING>|target = ^[*#;:]|plain=false}}
- should work, no? note that using "source=" means your string will be trimmed - if you want to allow for spaces (i.e., "<Space>#bla" will fail while "#bla" succeeds), drop the "source =" part: "your string" should be the first param after "find". this will return 0 if your string is not a list, and 1 if it is.
- if you meant for longer string that might contain newlines, then the request should be defined more precisely - what is expected to happen when some of the lines begin with one of [*#:;] but not all? (examples below)
- {{#invoke:String | find | source = b#la | target = ^[*#;:] | plain = false}} => 0
- {{#invoke:String | find |source = #bla | target = ^[*#;:] | plain = false }} => 1
- peace - קיפודנחש (aka kipod) (talk) 17:01, 7 March 2014 (UTC)
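The open question about multi-line strings (some lines starting with a list character, others not) could be handled by reporting both answers and letting the caller decide. This is a minimal sketch under that assumption; `startsWithListChar` and `listLineSummary` are hypothetical names, not existing module functions.

```lua
-- Returns true if a single line starts with "*", "#", ";" or ":".
local function startsWithListChar(line)
    return line:find('^[*#;:]') ~= nil
end

-- Summarises a multi-line string: "all" is true when every non-empty line
-- is a list line, "any" is true when at least one line is.
local function listLineSummary(text)
    local total, matching = 0, 0
    -- "[^\n]+" visits each non-empty line; empty lines are skipped.
    for line in text:gmatch('[^\n]+') do
        total = total + 1
        if startsWithListChar(line) then
            matching = matching + 1
        end
    end
    return {
        all = total > 0 and matching == total,
        any = matching > 0,
    }
end
```

For example, `listLineSummary('* a\nplain')` reports `any` as true and `all` as false, which makes the ambiguity in the request explicit.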
- Using string.match() inside Lua module: Hence, inside a Lua module, the functionality would be as follows:
if string.match(str, '^[*#;:]') ~= nil then
    hey = "There is a list item in str."
end
- Note the regex has been set to match the start-of-line by caret "^" and would not yet match text with a leading space (such as " :"); however, to also match with leading spaces, then include regex " *" to match zero-or-more spaces, as: string.match(str,'^ *[*#;:]') to match list items even with lead spaces. -Wikid77 (talk) 04:21, 9 March 2014, 04:16, 10 March 2014 (UTC)
- for the benefit of future generations who might look in the archive, i want to note that the last example, namely string.match(str,'^ *[*#;:]'), is not only useless, it's positively wrong: the "zero or more" quantifier, "*", applies only to the single character preceding it (semicolon), and not to the whole "no-break" marker, "&nbsp;", as Wikid77 seems to imply. (note: i injected a hidden directional character so the non-breaking-space will be visible when reading the page). "real" regex supports grouping (using parenthesis), which would make this possible, but lua pattern matching does not support it. peace. קיפודנחש (aka kipod) (talk) 00:49, 10 March 2014 (UTC)
- Well, I had used '&nbsp;' for typesetting the text, not as a literal ampersand-n-b-s-p as part of a regex pattern, but I have re-typeset those by using nowrap text; sorry for the confusion. So, the example I gave is not only quite useful, it's positively correct (I have 2 extensive university degrees in computer science, so that is why I was instantly familiar with the regex issues). -Wikid77 04:16, 10 March 2014 (UTC)
- @Wikid77: if it was meant for *real* whitespace rather than for non-breaking space, then your snippet may be technically correct, but it's lacking, and, of course, completely useless: lines beginning with spaces are parsed as "pre" elements and not as lists, even if the first character after the space is "*":
* this is not a list element
- beyond its uselessness, it's "lacking", because when one wants to match whitespaces, one should use %s rather than <Space>. this will match tabs and other whitespaces also. peace - קיפודנחש (aka kipod) (talk) 13:26, 10 March 2014 (UTC)
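The difference between a literal space and the %s class mentioned above can be checked directly in plain Lua. This is a small sketch; the variable names are only for illustration.

```lua
local spaceOnly = '^ *[*#;:]'   -- allows leading ASCII spaces only
local anySpace  = '^%s*[*#;:]'  -- allows any leading whitespace (%s)

local tabbed = '\t* item'       -- list marker preceded by a tab
local spaced = '  * item'       -- list marker preceded by two spaces

-- string.find returns the start index of the match, or nil on failure.
local tabWithSpaceOnly = tabbed:find(spaceOnly)  -- nil: a tab is not a space
local tabWithAnySpace  = tabbed:find(anySpace)   -- 1: %s also matches tabs
local spacedWithSpaceOnly = spaced:find(spaceOnly)  -- 1
```

Whether leading whitespace should be accepted at all is the point under dispute in this thread; the sketch only shows what each pattern matches.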
- List items preceded by spaces: At first I thought you were joking, but you actually seem to believe what you are saying. Instead, list items in Wikipedia can have leading spaces, such as in wp:parser functions (#ifexpr, lc, etc.). For example:
- 1. {{uc: ::::::* Bullet line}} =
- BULLET LINE
- 2. {{#ifexpr: 1 = 2-1 | {{{x| :::::::* bullet}}} }} =
- bullet
- Hence, for the way Wikipedia actually works, the regex pattern I gave is actually quite useful to match a list item with leading spaces. However, some whitespace characters literalize the list-items, such as " " which cannot precede an asterisk bullet. It is important to test notions in real markup, rather than imagine something as being "useless" in some imaginary world. -Wikid77 22:01, 10 March 2014 (UTC)
- But then again, parser functions and templates passed to Lua will have already been expanded, so for Lua scripts it does make sense to assume that wikitext lists must start at the start of a line. — Mr. Stradivarius on tour ♪ talk ♪ 22:52, 10 March 2014 (UTC)
- Well, you could assume parameters have no leading spaces, but it would be wrong; see below: "#Parser keeps lead spaces in unnamed parameters". -Wikid77 17:51, 11 March 2014 (UTC)
- (ec) Parser functions strip whitespace (so do template parameters, for that matter). The only thing I am interested in is the first character being [*#;:]. — Edokter (talk) — 22:54, 10 March 2014 (UTC)
- Only named parameters strip outer spaces (and the whitespace tabs): "#Parser keeps lead spaces in unnamed parameters". -Wikid77 17:51, 11 March 2014 (UTC)
Here's my effort:
local function isList(s)
    -- Returns true if s starts with the characters "*", "#", ";" or ":".
    -- Otherwise returns false. Will produce an error if s is not a string.
    if s:find('^[*#:;]') then
        return true
    else
        return false
    end
end
That's all you need to detect strings starting with [*#;:]. — Mr. Stradivarius ♪ talk ♪ 23:28, 10 March 2014 (UTC)
- And thinking about it, you could do that in even less code:
local function isList(s)
    -- Returns true if s starts with the characters "*", "#", ";" or ":".
    -- Otherwise returns false. Will produce an error if s is not a string.
    return s:find('^[*#:;]') ~= nil
end
- Not sure if there's any merit in doing that over the one above, though. — Mr. Stradivarius ♪ talk ♪ 23:34, 10 March 2014 (UTC)
- Nice and short. What happens if an error is raised? — Edokter (talk) — 23:38, 10 March 2014 (UTC)
- If an error is raised when running from #invoke, you get the lovely big red "script error" message. :) To avoid that, you have three basic options. The first is to make sure that you will only pass strings to the function. This is actually not so hard, as argument values passed from #invoke are always strings. (Argument keys can be numbers, however.) The second is to check the type of s before doing the matching. You can make it return nil if s is not a string, and true or false otherwise. That works like this:
local function isList(s)
    -- Returns true if s starts with the characters "*", "#", ";" or ":".
    -- Otherwise returns false. Will return nil if s is not a string.
    if type(s) ~= 'string' then
        return nil
    end
    return s:find('^[*#:;]') ~= nil
end
- Or this:
local function isList(s)
    -- Returns true if s starts with the characters "*", "#", ";" or ":".
    -- Otherwise returns false. Will return nil if s is not a string.
    if type(s) == 'string' then
        return s:find('^[*#:;]') ~= nil
    else
        return nil
    end
end
- In a similar vein, you can also convert s to a string before matching it, by using s = tostring(s). But type checking is more elegant. The third is to try and catch the error by using pcall - although I think just checking the type of s is neater. — Mr. Stradivarius ♪ talk ♪ 23:59, 10 March 2014 (UTC)
- about "neatness": it is very common (and neat, if you ask me) in lua to use boolean shortcut for this purpose exactly: so instead of if..then..else..end, one can simply write
return type(s) == 'string' and s:find('^[*#:;]')
this will return either false (not a string), nil (not a list) or the position of the match (the number 1). usually this is good enough, but if you prefer "true" rather than the match position, use
return type(s) == 'string' and s:find('^[*#:;]') and true
this pattern is very common in lua, it makes the code shorter and more concise, and getting familiar with it is helpful. i use it all the time - maybe sometimes to a fault... peace - קיפודנחש (aka kipod) (talk) 23:23, 11 March 2014 (UTC)
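The third option mentioned above, catching the error with pcall, is not shown in the thread. A minimal sketch of how it might look follows; `safeIsList` is a hypothetical name, and returning nil on error is an assumption chosen to mirror the type-checking versions.

```lua
-- The short version from the thread: errors if s is not a string.
local function isList(s)
    return s:find('^[*#:;]') ~= nil
end

-- Hypothetical wrapper: runs isList in protected mode so that a
-- non-string argument yields nil instead of a script error.
local function safeIsList(s)
    local ok, result = pcall(isList, s)
    if ok then
        return result
    end
    return nil
end
```

pcall returns false plus the error message when the call fails, so the wrapper only passes the result through on success. As noted in the thread, a plain type check is usually the neater choice.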
Parser keeps lead spaces in unnamed parameters
Note how the unnamed parameters retain any leading spaces/tabs, when invoking a Lua function:
- f1. {{#invoke:String|find|*bullet|target=^[*#:;]|plain=false}} = 1
- f2. {{#invoke:String|find| *bullet|target=^[*#:;]|plain=false}} = 0
- f3. {{#invoke:String|find|source= *bullet|target=^[*#:;]|plain=false}} = 1
- f4. {{#invoke:String|find| *bullet|target=^%s*[*#:;]|plain=false}} = 1
In case f2, the find() is unable to match with the lead-spaces and gives result "0" as a no-match, but in case f3, with the named parameter ("source="), the parser has omitted the outer spaces/tabs. To handle extra spaces/tabs, before a list-item, the regex pattern in case f4 uses "^%s*" to allow multiple spaces/tabs to precede a list-item, such as in "   *bullet" with 3 extra spaces. -Wikid77 (talk) 17:51, 11 March 2014 (UTC)
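Inside a module, one way to make unnamed and named parameters behave the same is to trim the value before matching. The sketch below uses a plain table in place of the real frame.args, and `trim` and `isListArg` are hypothetical names; in Scribunto itself, mw.text.trim offers the same trimming.

```lua
-- Removes leading and trailing whitespace (a common Lua idiom).
-- The outer parentheses keep only the first return value.
local function trim(s)
    return (s:match('^%s*(.-)%s*$'))
end

-- Hypothetical entry point: accepts the string either as the first
-- unnamed argument or as "source", and trims it before checking.
local function isListArg(args)
    local source = args[1] or args.source or ''
    return trim(source):find('^[*#:;]') ~= nil
end
```

With this, `isListArg({' *bullet'})` and `isListArg({source = '*bullet'})` give the same answer, so the caller does not need to care which parameter form was used.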
Pattern documentation
This is the Lua pattern documentation: mw:Extension:Scribunto/Lua_reference_manual#Introduction, and [1]. Is there a pattern documentation that does not say "like the regex we all know, except ..."? I need a 101 documentation, so in chapters 1 to 4 the word "greedy" does not appear. If such a documentation exists for regex, that would be great too. -DePiep (talk) 14:50, 7 March 2014 (UTC)
- Start with article "Regex" and remember percent '%' for literals: Because there are so many features, it would be best to tell people to read page "Regex" but beware how Lua uses percent sign '%' as the literalizing character (not standard "\"). Hence to match square brackets in a string, each bracket must be escaped as '%[' and '%]' because regex brackets specify a single-character set. Compare examples:
- 1. {{#invoke:String |find|source= aabbcc |target= [.*] |plain=false}} = 0
- 2. {{#invoke:String |find|source= a[ab]bcc |target= \[.*\] |plain=false}} = 0
- 3. {{#invoke:String |find|source= a[ab]bcc |target= %[.*%] |plain=false}} = 2
- 4. {{#invoke:String |find|source= aab[bc]c |target= %[.*%] |plain=false}} = 4
- Note how the brackets only matched (in lines 3 & 4) once they were escaped as Lua '%[' and '%]' and that is bizarre, compared to the typical POSIX backslash format, as '\[' and '\]'. Page "Regex" contains an extensive explanation of the major regex features, with several complex examples to demonstrate the potential power of using regex patterns. Unfortunately, a "Regex 101" guide will require learning the numerous regex patterns, just as "Greek 101" requires learning the 24-letter Greek Alphabet, in alphabetical order; otherwise a student cannot lookup words in a Greek/English dictionary unless knowing the sequence of Greek letters: α-β-γ-δ-ε, etc. Regex is similar to fair-use restrictions for copyrighted images; a student should plan at least 3-5 days of study to understand the major concepts and restrictions in usage.
FYI: It is crucial for students to understand how regex matches are "greedy" and match the longest possible string, as compared to typical human left-to-right, find-the-first-match. In fact, regex's greedy pattern-matching is likely one of the worst design flaws in the history of computer science, and has been a royal pain for over 35 years, requiring new symbols to allow "non-greedy" matches. I would rank greedy matches the same as "implicit declaration" of misspelled variable names, as likely to cause horrific bizarre results. -Wikid77 (talk) 05:36, 9 March 2014 (UTC)
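The percent-escaping examples above can be reproduced in plain Lua with string.find, which returns nil where Module:String reports 0. The `escapePattern` helper is a hypothetical name for a common idiom that escapes every punctuation character, so that user-supplied text can be matched literally.

```lua
-- The four cases from the list above, using string.find directly.
local r1 = ('aabbcc'):find('[.*]')         -- nil: the set holds "." and "*"
local r2 = ('a[ab]bcc'):find('\\[.*\\]')   -- nil: "\" is not Lua's escape
local r3 = ('a[ab]bcc'):find('%[.*%]')     -- 2
local r4 = ('aab[bc]c'):find('%[.*%]')     -- 4

-- Hypothetical helper: prefixes each punctuation character with "%".
-- The outer parentheses drop gsub's second return value (the count).
local function escapePattern(s)
    return (s:gsub('%p', '%%%0'))
end

local r5 = ('a[ab]bcc'):find(escapePattern('[ab]'))  -- 2
```

Escaping all punctuation is broader than strictly necessary, but Lua treats "%" followed by any non-alphanumeric character as that literal character, so the result is safe.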
- @DePiep: Try the lua-users wiki. If that's hard to understand, post back here with whatever you're having trouble with, and someone should be able to help you out. Good luck. :) — Mr. Stradivarius ♪ talk ♪ 08:06, 9 March 2014 (UTC)
- Also, for regex, I found regular-expressions.info useful. — Mr. Stradivarius ♪ talk ♪ 21:44, 9 March 2014 (UTC)
- I don't understand the complaint about greedy pattern matching. If I do {{#invoke:LuaCall|main|x=abaabaaab|y=b(.*)b|string.match(x,y)}} -> aabaaa, sure, that's greedy. But I can do {{#invoke:LuaCall|main|x=abaabaaab|y=b(.-)b|string.match(x,y)}} -> aa just as easily. And the greediness still uses a left to right search; it just doesn't stop until it has to. But it is annoying that the Lua format doesn't accept a number of the Javascript options, so that it's not easily possible to reuse Javascript patterns here. Wnt (talk) 03:21, 13 March 2014 (UTC)
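The two LuaCall examples above correspond to the following plain Lua, shown here as a small sketch. It also checks the point that both quantifiers begin matching at the same leftmost position and differ only in where the match ends.

```lua
local x = 'abaabaaab'

-- ".*" is greedy: it extends to the last "b" that still allows a match.
local greedy = x:match('b(.*)b')   -- 'aabaaa'

-- ".-" is lazy: it stops at the first "b" that allows a match.
local lazy = x:match('b(.-)b')     -- 'aa'

-- string.find returns the start and end index of the whole match.
local greedyStart, greedyEnd = x:find('b.*b')   -- 2, 9
local lazyStart, lazyEnd = x:find('b.-b')       -- 2, 5
```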
- I tried redigesting the Scribunto manual at [2] - I'm not sure I made anything clearer or in any way better though. Wnt (talk) 03:04, 14 March 2014 (UTC)
- I'm going to reiterate Module talk:LuaCall#I sincerely hope that no one ever actually uses this: I really don't think using that is something we should encourage use of. Anomie⚔ 10:31, 14 March 2014 (UTC)
- I don't understand the criticism. First, to be honest, I don't understand what load(func, chunk) is actually used for. And certainly I don't see any comparison to obfuscated C. What I know is that I find myself making use of this fairly often - most commonly for a string.len measurement of a DYK-related item. I assume it would be legitimate to write a dedicated module for purposes like this, but why bother having a separate module for each one? To be clear, I do understand that going back to LuaCall repeatedly within a single template would quickly become inefficient. Wnt (talk) 14:24, 14 March 2014 (UTC)
- Wnt, I guess you refer to my complaint about greedy. I meant to say that, when starting Lua patterns (or Regex for that matter), explaining "greedy" should not be the first topic. And so, a little hyperbolic, it should be "not in chapter 1 to 4". That is because when I want to learn Patterns from scratch, that is not the essence. To illustrate this with the demo you gave yourself: I first need to know what the meaningful symbols are you use (that is *().). Now please don't explain them to me here, it's just to show that I need to learn that first. -DePiep (talk) 14:41, 14 March 2014 (UTC)
- Wnt, sounds like you're looking for Module:String#len? — Mr. Stradivarius ♪ talk ♪ 15:24, 14 March 2014 (UTC)