WebSVN – Moodle – Blame – /lib/htmlpurifier/HTMLPurifier/AttrDef/CSS/FontFamily.php

Rev	Author	Line No.	Line
1	efrain	1	`<?php`
		2
		3	`/**`
		4	`* Validates a font family list according to CSS spec`
		5	`*/`
		6	`class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef`
		7	`{`
		8
		9	`protected $mask = null;`
		10
		11	`public function __construct()`
		12	`{`
		13	`// Lowercase letters`
		14	`$l = range('a', 'z');`
		15	`// Uppercase letters`
		16	`$u = range('A', 'Z');`
		17	`// Digits`
		18	`$d = range('0', '9');`
		19	`// Special bytes used by UTF-8`
		20	`$b = array_map('chr', range(0x80, 0xFF));`
		21	`// All valid characters for the mask`
		22	`$c = array_merge($l, $u, $d, $b);`
		23	`// Concatenate all valid characters into a string`
		24	`// Use '_- ' as an initial value`
		25	`$this->mask = array_reduce($c, function ($carry, $value) {`
		26	`return $carry . $value;`
		27	`}, '_- ');`
		28
		29	`/*`
		30	`PHP's internal strcspn implementation is`
		31	`O(length of string * length of mask), making it inefficient`
		32	`for large masks. However, it's still faster than`
		33	`preg_match 8)`
		34	`for (p = s1;;) {`
		35	`spanp = s2;`
		36	`do {`
		37	`if (*spanp == c || p == s1_end) {`
		38	`return p - s1;`
		39	`}`
		40	`} while (spanp++ < (s2_end - 1));`
		41	`c = *++p;`
		42	`}`
		43	`*/`
		44	`// possible optimization: invert the mask.`
		45	`}`
		46
		47	`/**`
		48	`* @param string $string`
		49	`* @param HTMLPurifier_Config $config`
		50	`* @param HTMLPurifier_Context $context`
		51	`* @return bool|string`
		52	`*/`
		53	`public function validate($string, $config, $context)`
		54	`{`
		55	`static $generic_names = array(`
		56	`'serif' => true,`
		57	`'sans-serif' => true,`
		58	`'monospace' => true,`
		59	`'fantasy' => true,`
		60	`'cursive' => true`
		61	`);`
		62	`$allowed_fonts = $config->get('CSS.AllowedFonts');`
		63
		64	`// assume that no font names contain commas in them`
		65	`$fonts = explode(',', $string);`
		66	`$final = '';`
		67	`foreach ($fonts as $font) {`
		68	`$font = trim($font);`
		69	`if ($font === '') {`
		70	`continue;`
		71	`}`
		72	`// match a generic name`
		73	`if (isset($generic_names[$font])) {`
		74	`if ($allowed_fonts === null || isset($allowed_fonts[$font])) {`
		75	`$final .= $font . ', ';`
		76	`}`
		77	`continue;`
		78	`}`
		79	`// match a quoted name`
		80	`if ($font[0] === '"' || $font[0] === "'") {`
		81	`$length = strlen($font);`
		82	`if ($length <= 2) {`
		83	`continue;`
		84	`}`
		85	`$quote = $font[0];`
		86	`if ($font[$length - 1] !== $quote) {`
		87	`continue;`
		88	`}`
		89	`$font = substr($font, 1, $length - 2);`
		90	`}`
		91
		92	`$font = $this->expandCSSEscape($font);`
		93
		94	`// $font is a pure representation of the font name`
		95
		96	`if ($allowed_fonts !== null && !isset($allowed_fonts[$font])) {`
		97	`continue;`
		98	`}`
		99
		100	`if (ctype_alnum($font) && $font !== '') {`
		101	`// very simple font, allow it in unharmed`
		102	`$final .= $font . ', ';`
		103	`continue;`
		104	`}`
		105
		106	`// bugger out on whitespace. form feed (0C) really`
		107	`// shouldn't show up regardless`
		108	`$font = str_replace(array("\n", "\t", "\r", "\x0C"), ' ', $font);`
		109
		110	`// Here, there are various classes of characters which need`
		111	`// to be treated differently:`
		112	`// - Alphanumeric characters are essentially safe. We`
		113	`// handled these above.`
		114	`// - Spaces require quoting, though most parsers will do`
		115	`// the right thing if there aren't any characters that`
		116	`// can be misinterpreted`
		117	`// - Dashes rarely occur, but they are fairly unproblematic`
		118	`// for parsing/rendering purposes.`
		119	`// The above characters cover the majority of Western font`
		120	`// names.`
		121	`// - Arbitrary Unicode characters not in ASCII. Because`
		122	`// most parsers give little thought to Unicode, treatment`
		123	`// of these codepoints is basically uniform, even for`
		124	`// punctuation-like codepoints. These characters can`
		125	`// show up in non-Western pages and are supported by most`
		126	`// major browsers, for example: "ＭＳ明朝" is a`
		127	`// legitimate font-name`
		128	`// <http://ja.wikipedia.org/wiki/MS_明朝>. See`
		129	`// the CSS3 spec for more examples:`
		130	`// <http://www.w3.org/TR/2011/WD-css3-fonts-20110324/localizedfamilynames.png>`
		131	`// You can see live samples of these on the Internet:`
		132	`// <http://www.google.co.jp/search?q=font-family+ＭＳ+明朝|ゴシック>`
		133	`// However, most of these fonts have ASCII equivalents:`
		134	`// for example, 'MS Mincho', and it's considered`
		135	`// professional to use ASCII font names instead of`
		136	`// Unicode font names. Thanks Takeshi Terada for`
		137	`// providing this information.`
		138	`// The following characters, to my knowledge, have not been`
		139	`// used in font names.`
		140	`// - Single quote. While theoretically you might find a`
		141	`// font name that has a single quote in its name (serving`
		142	`// as an apostrophe, e.g. Dave's Scribble), I haven't`
		143	`// been able to find any actual examples of this.`
		144	`// Internet Explorer's cssText translation (which I`
		145	`// believe is invoked by innerHTML) normalizes any`
		146	`// quoting to single quotes, and fails to escape single`
		147	`// quotes. (Note that this is not IE's behavior for all`
		148	`// CSS properties, just some sort of special casing for`
		149	`// font-family). So a single quote cannot be used`
		150	`// safely in the font-family context if there will be an`
		151	`// innerHTML/cssText translation. Note that Firefox 3.x`
		152	`// does this too.`
		153	`// - Double quote. In IE, these get normalized to`
		154	`// single-quotes, no matter what the encoding. (Fun`
		155	`// fact, in IE8, the 'content' CSS property gained`
		156	`// support, where they special cased to preserve encoded`
		157	`// double quotes, but still translate unadorned double`
		158	`// quotes into single quotes.) So, because their`
		159	`// fixpoint behavior is identical to single quotes, they`
		160	`// cannot be allowed either. Firefox 3.x displays`
		161	`// single-quote style behavior.`
		162	`// - Backslashes are reduced by one (so \\ -> \) every`
		163	`// iteration, so they cannot be used safely. This shows`
		164	`// up in IE7, IE8 and FF3`
		165	`// - Semicolons, commas and backticks are handled properly.`
		166	`// - The rest of the ASCII punctuation is handled properly.`
		167	`// We haven't checked what browsers do to unadorned`
		168	`// versions, but this is not important as long as the`
		169	`// browser doesn't /remove/ surrounding quotes (as IE does`
		170	`// for HTML).`
		171	`//`
		172	`// With these results in hand, we conclude that there are`
		173	`// various levels of safety:`
		174	`// - Paranoid: alphanumeric, spaces and dashes(?)`
		175	`// - International: Paranoid + non-ASCII Unicode`
		176	`// - Edgy: Everything except quotes, backslashes`
		177	`// - NoJS: Standards compliance, e.g. sod IE. Note that`
		178	`// with some judicious character escaping (since certain`
		179	`// types of escaping don't work) this is theoretically`
		180	`// OK as long as innerHTML/cssText is not called.`
		181	`// We believe that international is a reasonable default`
		182	`// (that we will implement now), and once we do more`
		183	`// extensive research, we may feel comfortable with dropping`
		184	`// it down to edgy.`
		185
		186	`// International: alphanumeric, spaces, dashes, underscores and Unicode. Use of`
		187	`// str(c)spn assumes that the string was already well formed`
		188	`// Unicode (which of course it is).`
		189	`if (strspn($font, $this->mask) !== strlen($font)) {`
		190	`continue;`
		191	`}`
		192
		193	`// Historical:`
		194	`// In the absence of innerHTML/cssText, these ugly`
		195	`// transforms don't pose a security risk (as \\ and \"`
		196	`// might--these escapes are not supported by most browsers).`
		197	`// We could try to be clever and use single-quote wrapping`
		198	`// when there is a double quote present, but I have chosen`
		199	`// not to implement that. (NOTE: you can reduce the amount`
		200	`// of escapes by one depending on what quoting style you use)`
		201	`// $font = str_replace('\\', '\\5C ', $font);`
		202	`// $font = str_replace('"', '\\22 ', $font);`
		203	`// $font = str_replace("'", '\\27 ', $font);`
		204
		205	`// font possibly with spaces, requires quoting`
		206	`$final .= "'$font', ";`
		207	`}`
		208	`$final = rtrim($final, ', ');`
		209	`if ($final === '') {`
		210	`return false;`
		211	`}`
		212	`return $final;`
		213	`}`
		214
		215	`}`
		216
		217	`// vim: et sw=4 sts=4`
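
The mask check in `validate()` can be tried in isolation. The following is a minimal standalone sketch, not HTML Purifier code: `build_font_mask()` and `font_passes_mask()` are hypothetical helper names introduced for illustration, and the mask is assembled with `implode()` rather than `array_reduce()`, which should yield the same string. It shows that `strspn()` accepts a name only when every byte is a letter, digit, underscore, dash, space, or a byte in the range 0x80–0xFF.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): builds the same set of
// allowed bytes as the constructor in the listing.
function build_font_mask(): string
{
    $chars = array_merge(
        range('a', 'z'),                     // lowercase letters
        range('A', 'Z'),                     // uppercase letters
        range('0', '9'),                     // digits (integers, cast by implode)
        array_map('chr', range(0x80, 0xFF))  // bytes used by multi-byte UTF-8
    );
    return '_- ' . implode('', $chars);
}

// Hypothetical helper: true when every byte of $font appears in $mask.
function font_passes_mask(string $font, string $mask): bool
{
    return strspn($font, $mask) === strlen($font);
}

$mask = build_font_mask();

var_dump(font_passes_mask('MS Mincho', $mask));        // bool(true)
var_dump(font_passes_mask('ＭＳ明朝', $mask));          // bool(true), all bytes >= 0x80
var_dump(font_passes_mask("Dave's Scribble", $mask));  // bool(false), single quote
var_dump(font_passes_mask('Arial\\', $mask));          // bool(false), backslash
```

Because the check works on bytes, it relies on the input already being well-formed UTF-8, as the comment in the listing points out.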


(root)/lib/htmlpurifier/HTMLPurifier/AttrDef/CSS/FontFamily.php – Rev 1
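
The constructor in the listing ends with the note `// possible optimization: invert the mask.` One way to read that note is sketched below, under the assumption that "inverting" means listing the rejected ASCII bytes and testing with `strcspn()` instead of `strspn()`. The helper names `build_rejected_bytes()` and `font_passes_inverted()` are hypothetical and do not exist in HTML Purifier. The rejected set has 63 bytes, compared with 193 in the allow-list mask.

```php
<?php
// Hypothetical helper: collects every ASCII byte (0x00-0x7F) that the
// allow-list mask would NOT accept.
function build_rejected_bytes(): string
{
    $allowed = '_- ' . implode('', array_merge(
        range('a', 'z'),
        range('A', 'Z'),
        range('0', '9')
    ));
    $rejected = '';
    for ($i = 0; $i < 0x80; $i++) {
        $ch = chr($i);
        if (strpos($allowed, $ch) === false) {
            $rejected .= $ch;
        }
    }
    return $rejected;
}

// Hypothetical helper: true when $font contains none of the rejected bytes.
// Bytes 0x80-0xFF are never in $rejected, so non-ASCII names still pass.
function font_passes_inverted(string $font, string $rejected): bool
{
    return strcspn($font, $rejected) === strlen($font);
}

$rejected = build_rejected_bytes();

var_dump(font_passes_inverted('MS Mincho', $rejected));        // bool(true)
var_dump(font_passes_inverted('ＭＳ明朝', $rejected));          // bool(true)
var_dump(font_passes_inverted("Dave's Scribble", $rejected));  // bool(false)
```

Whether this is measurably faster depends on the PHP version and its `strcspn()` implementation; no benchmark is claimed here.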