User:Pat/sort

Tables can be made sortable via client-side JavaScript. This works in MediaWiki 1.9, which is installed on all Wikimedia projects. A sortable table is identified by the arrows in each of its header cells. Clicking an arrow sorts the table rows by the selected column, in ascending order first, and subsequent clicks toggle between ascending and descending order. Links and other wiki markup are not possible in headers.

Note that all of the below is subject to change due to improvements in the script.

JavaScript
Each site has a copy of the JavaScript code wikibits.js at /skins-1.5/common/wikibits.js; on this site it is at /skins-1.5/common/wikibits.js. In addition, a site may have a page MediaWiki:Common.js which adds to and overrides some of this code. The description below is in the process of being adapted to the version on Meta. The sorting code in it can be copied to other sites (by sysops of those sites).

Sorting modes
The sorting modes (the data types, which, in addition to the choice "ascending" or "descending", determine the sorting order) are:
 * string
   * criterion: the first non-blank element is not of type numeric, date or currency;
   * order: after conversion of capitals to lowercase the order is ASCII - partial list showing the order: !"#$%&'*+,-./09:;<=>?@[\]^_'az{|}~é&mdash; (see also below; a blank space comes before every other character; an nbsp code counts as a space; two adjacent ordinary blank spaces count as one; for multiple blank spaces one can use nbsps or alternate nbsps and ordinary blank spaces)
 * numeric
   * criterion: the first non-blank element consists of just digits, points, commas, spaces, +, -, possibly followed by e or E and digits
   * order: if the string starts with a number (where spaces and nbsps at the start are ignored) the order is numeric according to the first number in the string (parseFloat is applied) after removing the commas, if any; if it does not (parseFloat returns NaN), the element is positioned like 0
   * proposed improvement: ignore spaces in evaluating numbers to determine the sorting order
   * proposed internationalisation: in German etc., treat comma as a decimal point
 * date (see also below)
   * criterion: the first non-blank element is of the form "dd-dd-dddd", "dd-dd-dd", or "dd aaa dddd"
   * order: the string abcdefghij of length 10 is positioned as ghijdeab, the string abcdefgh of length 8 as 19ghdeab if gh>=50 (string comparison) and 20ghdeab otherwise (i.e., the assumed format is DD-MM-YYYY or DD-MM-YY), and the string "dd aaa dddd", with aaa an abbreviated month name, chronologically
 * currency
   * criterion: the first non-blank element starts with $, £, €, or ¥
   * order: numeric, ignoring all characters except digits and points
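
The date and currency orders in the list above can be illustrated with a minimal sketch. This is not the code in wikibits.js; the function names, the English month abbreviations, and the treatment of unparsable currency text as 0 are assumptions made for illustration.

```javascript
// Assumed English month abbreviations for the "dd aaa dddd" form.
var MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
              "jul", "aug", "sep", "oct", "nov", "dec"];

// Hypothetical helper: turn a date cell into a string that sorts chronologically.
function dateSortKey(text) {
  var t = text.trim();
  if (/^\d\d-\d\d-\d\d\d\d$/.test(t)) {
    // DD-MM-YYYY: "abcdefghij" becomes "ghijdeab"
    return t.slice(6, 10) + t.slice(3, 5) + t.slice(0, 2);
  }
  if (/^\d\d-\d\d-\d\d$/.test(t)) {
    // DD-MM-YY: "abcdefgh" becomes "19ghdeab" if gh >= "50", else "20ghdeab"
    var yy = t.slice(6, 8);
    return (yy >= "50" ? "19" : "20") + yy + t.slice(3, 5) + t.slice(0, 2);
  }
  var m = /^(\d\d) ([A-Za-z]{3}) (\d\d\d\d)$/.exec(t);
  if (m) {
    var month = MONTHS.indexOf(m[2].toLowerCase()) + 1;
    if (month > 0) {
      return m[3] + (month < 10 ? "0" : "") + month + m[1];
    }
  }
  return null; // not one of the three recognised forms
}

// Hypothetical helper: currency cells sort numerically on digits and points only.
function currencySortKey(text) {
  var n = parseFloat(text.replace(/[^0-9.]/g, ""));
  return isNaN(n) ? 0 : n; // assumption: no number found counts as 0
}
```

Because each date key is a fixed-width string of digits, comparing the keys as strings gives chronological order.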

The sorting mode is determined by the table element that is currently in the first non-blank row below the header. Thus it may change after sorting, which can give a cycle of four or even more instead of two.
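
The following sketch shows how the mode can depend on which cell happens to be at the top. It is an illustration only: `classifyCell` and `columnSortMode` are hypothetical names, the regular expressions approximate the criteria listed above, and checking for a date before checking for a number is an assumption.

```javascript
// Hypothetical classifier following the criteria listed above.
function classifyCell(text) {
  var t = text.replace(/\u00a0/g, " ").trim();
  if (/^\d\d-\d\d-\d\d(\d\d)?$/.test(t) || /^\d\d [A-Za-z]{3} \d\d\d\d$/.test(t)) {
    return "date";
  }
  if (/^[$£€¥]/.test(t)) {
    return "currency";
  }
  if (/^[\d.,\s+-]+([eE]\d+)?$/.test(t)) {
    return "numeric";
  }
  return "string";
}

// The mode of a column is taken from its first non-blank cell.
function columnSortMode(cells) {
  for (var i = 0; i < cells.length; i++) {
    if (cells[i].replace(/\u00a0/g, " ").trim() !== "") {
      return classifyCell(cells[i]);
    }
  }
  return "string"; // assumption: an entirely blank column sorts as string
}
```

With the cells in the order `["", "12", "abc"]` the column is classified as numeric, but once a sort has moved `"abc"` to the top the same column is classified as string.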

Examples
Text after a number (e.g. a footnote) does not affect the sorting order, if the sorting mode is numeric. However, if the number at the top has text after it, this makes the sorting mode alphabetic.

The example with "a" gives alphabetic sorting, and so does the example with "e": the data are not mistaken for numbers in scientific format.

The first example demonstrates that text is positioned at zero, and that e.g. e3 for 1000 is not allowed; use 1e3 instead. The second example shows that expressions are not sorted according to their evaluated value, but according to the first number.
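
These points follow from how parseFloat behaves. The sketch below is a minimal illustration with hypothetical function names, assuming only what is described above: commas are removed, parseFloat is applied, and NaN is positioned like 0.

```javascript
// Hypothetical helper: the number a cell is sorted by in numeric mode.
function numericSortKey(text) {
  var n = parseFloat(text.replace(/,/g, "")); // commas removed, first number parsed
  return isNaN(n) ? 0 : n;                    // text without a leading number counts as 0
}

// Hypothetical helper: sort a copy of the cells in ascending numeric order.
function sortNumerically(cells) {
  return cells.slice().sort(function (a, b) {
    return numericSortKey(a) - numericSortKey(b);
  });
}
```

For example, `"1,234 [1]"` is sorted as 1234 because the footnote is ignored, `"2+3"` is sorted as 2 rather than 5, and `"e3"` is sorted as 0 while `"1e3"` is sorted as 1000.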

Alphabetic sorting with hidden sortkey
If necessary one can apply alphabetic sorting using a sortkey which due to CSS is not displayed:

(However, on some projects, notably Ontoworld, a page with this wikitext cannot be saved, as spam protection.)

JavaScript sorting is based on the text inside and outside the tags, without the tags themselves. The sortkey comes at the start and is separated from the displayed text in such a way that the latter does not affect the sorting order. For example, if a sortkey system is used where there are no blank spaces in any sortkey, then a blank space can be used for separation. If a single blank space is possible in a sortkey, two nbsps can be used. For table elements for which the text to be displayed is equal to the sortkey, no duplication is needed, of course.

If the text inside and outside the tags together is of a form that would cause a sorting mode other than alphabetic (if and when the element is at the top), a character can be appended at the end of the sortkey to avoid this, again making sure it does not affect the sorting order by putting a space or two nbsps. This can be dispensed with if the element can never be at the top, but this can be complicated to assess as that can be caused by sorting other columns, with varying sorting modes, and it can change when deleting a row, adding a column, etc.
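
As a rough sketch of the idea, the helpers below build a cell with a CSS-hidden sortkey and show the text that sorting would see. Both function names are hypothetical, the span markup is an assumed form of the hidden sortkey, and placing the separator inside the hidden span is a choice made here for illustration.

```javascript
// Hypothetical helper: cell markup with a sortkey that CSS hides.
// The separator defaults to one blank space and is kept inside the hidden span.
function hiddenSortkeyCell(sortkey, displayText, separator) {
  var sep = separator === undefined ? " " : separator;
  return '<span style="display:none">' + sortkey + sep + '</span>' + displayText;
}

// Hypothetical helper: the text used for sorting, i.e. everything
// inside and outside the tags, without the tags themselves.
function sortTextOf(markup) {
  return markup.replace(/<[^>]*>/g, "");
}
```

For a cell built with `hiddenSortkeyCell("smith", "John Smith")`, the sort text is `"smith John Smith"`, so the row is placed under "s" although the reader sees only "John Smith".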

Instead of "display:none" another way is using a font color equal to the background, e.g. 999 gives "999 ". With this method the hidden code can be seen in selected text (e.g. with the mouse). Also the hidden text is included when copying the rendered text. The first may be an advantage or a disadvantage; the second seems only a disadvantage. A further complication is that if a user uses a background color different from the default, the specified text color may not match it; to make sure they are the same, the background color can be specified also.

Unsuitability of padding with no-break spaces
The effect of left-padding with "&nbsp;" codes, which render as blank spaces, depends on the browser: in IE they are (unlike actual blank spaces) counted for sorting as leading blank spaces, so in a list of numbers with text (for which the alphabetic sorting mode applies) they could be used to equalize the number of characters before the explicit or implicit decimal separator. However, in Firefox they are ignored for the purpose of sorting.

See also w:Talk:List_of_U.S._states_by_population.

Padding with zeros
Example:
 * 000000

Formatnum can be combined with padleft:

Integer:

299,792,458 gives:


 * 299,792,458

Real:

0.000000 gives:


 * 0.000000
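
To see why zero padding helps, here is a small JavaScript analogue. It is not the padleft parser function itself; `zeroPad` is a hypothetical name and the sketch assumes non-negative integers.

```javascript
// Hypothetical helper: left-pad a non-negative integer with zeros to a fixed width.
function zeroPad(n, width) {
  return String(n).padStart(width, "0");
}

// Unpadded strings sort alphabetically, which is not numeric order.
var unpadded = ["5", "40", "300"].sort();

// Padded to a common width, alphabetic order and numeric order agree.
var padded = [300, 5, 40].map(function (n) { return zeroPad(n, 6); }).sort();
```

Here `unpadded` ends up as 300, 40, 5, while `padded` ends up as 000005, 000040, 000300.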

Alphabetic sortkey for numeric sorting
In some cases it is not possible to use numeric sorting mode:
 * the numbers are preceded by some text other than a currency symbol
 * some elements in a column, possibly also the first, are not numbers

In this case one may want to construct a hidden alphabetic sortkey for numeric sorting. In the example table, the left column shows the code for alphabetic sorting: the cryptic sortkey followed by the regular notation. The second column contains the same (and hence sorts the same), but with the code hidden with CSS. The third column shows the corresponding plain numbers with thousands separators, equal to what the second column shows, now using numeric sorting mode. The sortkey can be constructed for all numbers between -1e100 and 1e100 in arbitrary precision as follows:
 * where scientific notation is used, it is normalized such that the absolute value of the mantissa is between 1 and 10; the exponent is put first
 * scientific notation is used for all negative numbers, and all positive numbers outside some interval (below: 1e-9 to 1e9), and not inside that interval
 * where the absolute value of the exponent and/or the mantissa is a decreasing function of the number, the notation uses its complement with respect to 99 for exponents and 10 for mantissas; the code "c" is added in these cases
 * numbers 0 ≤ x < 1000 get a "+" in front
 * positive numbers in scientific notation with a negative exponent get "+0" in front
 * spaces inside the code and &-signs in front are added where needed:
   * for numbers not in scientific notation the positions of all explicit and implicit decimal points are aligned
   * for the starting position, i.e. the position of the first "-", "+", or "e", of other numbers, see the example table
 * no code should satisfy the criterion for numeric sorting mode (below we have always either an ampersand or two letters e): although this matters only for the element at the top, any element might arrive at the top due to sorting another column
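
The complement step in the list above can be sketched on its own. The sketch does not reproduce the full sortkey layout (prefixes, spacing and the "c" code are left out), both function names are hypothetical, and ordinary floating-point numbers are used instead of arbitrary precision.

```javascript
// Hypothetical helper: complement of an exponent (0..99) with respect to 99,
// written with two digits so that string comparison works.
function complementExponent(e) {
  var c = 99 - e;
  return (c < 10 ? "0" : "") + c;
}

// Hypothetical helper: complement of a mantissa (1 <= m < 10) with respect to 10.
function complementMantissa(m) {
  return 10 - m;
}
```

For negative numbers a larger mantissa or exponent means a smaller number, so the complement restores ascending order: -3e5 comes before -2e5, and the complemented mantissas 7 and 8 are in the same order; -1e7 comes before -1e5, and the complemented exponents "92" and "94" are in the same order.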

Dates
For dates, the sorting mode is based on the rendered date format. Unfortunately, none of the standard formats of MediaWiki's date-formatting feature matches any of the formats for the "date" sorting mode. Thus, if dates are entered in one of those standard formats, the sorting mode would be "string"; only dates formatted as  will result in true chronological sorting.

However, as above, we can put a sortkey in front which, due to CSS, is not displayed. With a hidden sortkey one can simply use the non-wikilinked format  for years AD followed by any choice of displayable text, including MediaWiki date formatting. The Wikipedia template provides a convenient way of applying this method while using the date-formatting feature for display.

For years BC we can use, for example,  for -0062-09-23 (subtract the year number BC from 10000, or the absolute value of the astronomical year from 9999).
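
The year arithmetic can be checked with a short sketch. The function names are hypothetical, only the year part of the sortkey is computed, and years from 1 BC to 9999 BC are assumed.

```javascript
// Hypothetical helper: year part of the sortkey from a year number BC.
function bcYearKey(yearBC) {
  return String(10000 - yearBC).padStart(4, "0");
}

// Hypothetical helper: the same value from an astronomical year (e.g. -62 for 63 BC).
function astronomicalYearKey(year) {
  return String(9999 - Math.abs(year)).padStart(4, "0");
}
```

Both `bcYearKey(63)` and `astronomicalYearKey(-62)` give "9937". An earlier year gives a smaller value: 63 BC gives "9937" and 44 BC gives "9956", so alphabetic order of the keys is chronological.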

If some or all of the dates in a table column are incomplete, this will not cause sorting problems. If only a year and month are given, that incomplete date is positioned alphabetically before the first day of the month in question. Likewise, if only a year is given, the date is positioned before the first month or day given for that year.

If at some point (i.e., after possible previous sorting) the form  is at the top with a non-negative year, sorting would be numerical; in this case, after toggling between ascending and descending there would be no proper sorting within each year (because parseFloat is applied, finding the first number in the string, and basing sorting on only that number). Also, years BC would not be sorted properly. Therefore, alphabetic sorting has to be enforced. This can be done by putting a non-displayed character after the year, separated by a space.

See also:
 * bugzilla:8226.

Limitations
JavaScript sorting may not work properly on tables with cells extending over multiple rows and/or columns. In some cases the table is garbled when attempting to sort; in other cases some of the sorting buttons work while others do not.

Sorting the wikitext of a table
Unfortunately it does not seem possible to directly and automatically sort the wikitext itself, according to one of the sortkeys. This would, after saving, directly produce a table sorted as required.

However, if for a given table, we make an auxiliary sortable table rendering as wikitext for the original table, we can sort the wikitext of the original table.

Example:

Original table:

Auxiliary table:

After copying the rendered text to the edit box, and deleting the header line, this renders as:

Alphabetic sorting order
The two-character entries such as A1 demonstrate that A and a are at the same position.
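
The alphabetic order described on this page can be approximated as follows. This is a sketch, not the comparison in wikibits.js; the function names are hypothetical, and collapsing ordinary spaces before treating nbsps as spaces is an assumption based on the description of the string mode.

```javascript
// Hypothetical helper: normalise a cell for alphabetic comparison.
function stringSortKey(text) {
  return text
    .toLowerCase()              // capitals are converted to lowercase
    .replace(/ +/g, " ")        // adjacent ordinary blank spaces count as one
    .replace(/\u00a0/g, " ");   // an nbsp counts as a space
}

// Hypothetical comparator: compare the normalised keys by character code.
function compareStrings(a, b) {
  var x = stringSortKey(a);
  var y = stringSortKey(b);
  return x < y ? -1 : (x > y ? 1 : 0);
}
```

With this comparator "A1" and "a1" compare as equal, and "a" sorts before "Z" because "Z" is compared as "z".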