TESTING CSS IN MOZILLA: WHERE TO GO FROM HERE
=============================================

Version 1.0

INTRODUCTION
------------

AIMS

The primary aim of Quality Assurance (QA) is to make the product
better by reporting failures in sufficient detail for the engineering
team to then fix these failures.

In order to report failures, they first have to be found. Finding
failures forms the bulk of the work done by QA. There are several ways
of finding failures, and these are discussed in detail in the next
section. Testing the product directly is the most obvious way of
finding failures, but there are several other techniques, for instance
reading user feedback (e.g. from beta programmes).


CSS

Cascading Style Sheets (CSS) is a simple technology for styling web
pages. It is designed to allow an easy separation of the stylistic
aspects of a document (e.g. "green bold text") from the structural and
semantic aspects of a document (e.g. "section header").

CSS is based on two fundamental concepts.

The first concept is that CSS is a tree decoration language. CSS
defines a list of properties, for example 'color' or 'font-size'.
Applying CSS style sheets to a tree causes each node in the tree to
have a specified value for each of these properties.

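For example, the following minimal sketch (the rules are illustrative,
not taken from any particular test suite) decorates every EM element
in the tree with a green 'color', and every element in class "note"
with a doubled 'font-size':

       em    { color: green; }
       .note { font-size: 2em; }

Nodes that no rule matches still end up with a value for each
property: they take the property's inherited or initial value, as
defined by the cascade.
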
The second concept is the CSS rendering model. This describes how
blocks, tables, text and other layout elements are displayed, how
fonts are selected, and so on.

The mapping of the decorated tree into the rendering model is what
forms the majority of the CSS specification.

CSS was originally invented in 1995 by Hakon Lie and Bert Bos, and
became a World Wide Web Consortium (W3C) Recommendation in late 1996.
In 1998 a second version was released, and since then much progress
has been made on a third version, which will be split into many
modules for both political and technical reasons, and on publishing
extensive errata for the published versions based on implementation
experience.

The CSS Working Group, which is responsible for this work, consists of
representatives from various different implementors of CSS user
agents, users of the technology, and other interested parties.

Further reading: http://www.w3.org/Style/CSS/


MOZILLA

Mozilla is a free software internet application suite. It forms the
basis of products such as the Netscape browser, the "Instant AOL"
consumer device, and the ActiveState Komodo IDE.

Further reading: http://www.mozilla.org/


TESTING METHODOLOGY
-------------------

TYPES OF COVERAGE ("WHEN")

There are several ways of finding failures. These form a broad
spectrum of test coverage, ranging from the unsuspecting public to the
simplest of targeted test cases. Let us examine each in turn.

   * Real world coverage. This is the ultimate test. Ideally, end
     users would never experience software defects (bugs) but,
     unfortunately, writing software as complex as a web browser on a
     tight schedule inevitably means a compromise must be made between
     perfection and shipping the product before the company goes
     bankrupt. In the case of CSS, bugs in the last released version
     of the product are sometimes reported by web developers in
     official feedback forms or in public forums. These are an
     important source of bug reports, but in practice the signal to
     noise ratio is usually too low to warrant spending much time
     here. (Typically, issues reported in official feedback forms and
     public forums are well known issues, or even incorrect reports.)

   * Pre-release beta testing. Errors reported in widely distributed
     releases of non-final versions of the product are often known
     issues (as with bugs reported in final versions), but by counting
     how many times each issue is reported, the most important bugs
     can be prioritized before the final version is released.

   * Dogfood. It is good practice to use the software one is
     developing on a daily basis, even when one is not actively
     working on developing or testing the product. This is known as
     "eating one's dogfood". Many bugs, usually user interface (UI)
     issues but also occasionally web standards bugs, are found by
     people using daily builds of the product while not actively
     looking for failures. A bug found using this technique which
     prevents the use of the product on a regular basis is called a
     "dogfood" bug and is usually given a very high priority. [2]

   * "top100" and site-specific testing. The web's most popular pages
     are regularly checked by visual inspection to ensure that they
     display correctly. (It is hard, if not impossible, to automate
     this task, because these pages change very, very frequently.)
     Bugs found through this technique are important, because many
     users would encounter them should a product be released with such
     a defect. In practice, many rendering issues found on top100
     pages are actually caused by errors on the pages themselves, for
     example using invalid CSS or CSS which is incorrectly handled by
     other browsers. Most web authors do not check the validity of
     their pages, and assume that if the page "looks right" on popular
     browsers, it must be correct.

   * Smoketests. Each day, before work is allowed to begin on the
     code base, the previous day's work must pass an extremely simple
     set of tests known as "smoketests". The name comes from the idea
     that these tests are the software equivalent of powering up a new
     prototype circuit and seeing if it catches fire! CSS does not
     figure very prominently in the smoketest steps, so few, if any,
     CSS bugs are caught this way. However, since CSS underpins a
     large part of the application's UI, if anything is seriously
     wrong with the CSS infrastructure, it will be caught by the
     smoketests. Bugs found this way are known as "smoketest
     blockers", and with good reason: all work is blocked until the
     bugs are fixed. This is to ensure that the bugs are fixed
     quickly.

   * Tinderbox tests. Tests are also run on a continuous basis on a
     system known as the tinderbox. This tests the absolute latest
     code base, and is therefore a good way of catching unexpected
     compile errors (code that works on one platform might not work on
     another) and major problems such as startup failures. There are
     no CSS tests currently being run on the tinderboxes; however,
     this is a direction which will be worth pursuing in the future.

   * Automated tests. Some tests have been adapted for a test harness
     known as NGDriver. These tests run unattended and can therefore
     cover large areas of the product with minimum effort. Until
     recently, CSS could not easily be tested using an automation
     system. However, with the advent of the Layout Automation System
     (LAS), there now exists a test harness that is capable of
     displaying a test page and then comparing this test page, pixel
     for pixel, with a pre-stored image.

     Automation is the holy grail of QA. Unfortunately, there are many
     aspects that are hard or impossible to automate, such as
     printing.

   * Engineer regression and pre-checkin tests. In order to catch
     errors before they are flagged on the tinderbox (which would
     waste a lot of time), engineers must run a set of tests before
     committing their changes to the main code base. These tests are
     known as 'pre-checkin tests'. Certain changes also require that
     the new code be run through specially designed regression tests
     that are written to flag any regressions (new defects which were
     not present in a previous version) in the new code.

   * Manual test runs. Before releases, and occasionally at other
     times as well (for instance when a new operating system is
     released), every test case is run through a test build of the
     product and manually inspected for errors. This is a very time
     consuming process, but (assuming the person running the tests is
     familiar with them and the specification being tested) it is a
     very good way of catching regressions.

   * QA test development. The main way of discovering bugs is the
     continuous creation of new test cases. These tests then get added
     either to the manual test case lists or the automated test case
     lists, so that they can flag regressions if they occur.

   * Engineer test development. When a bug is discovered, the file
     showing this bug is reduced to the smallest possible file that
     still reproduces the bug. This enables engineers to concentrate
     on the issue at hand, without getting confused by other issues.
     Oddly enough, during this process it is not unusual to discover
     multiple other related bugs, and this is therefore an important
     source of bug reports.


TYPES OF TEST CASES ("WHERE")

If the various techniques for finding failures give a list of _when_
bugs are typically found, then the various different kinds of test
cases give a list of _where_ the failures are found.

   * The original files of a bug found in the field. Typically, bugs
     reported by end users and beta testers will simply consist of the
     web address (URI) of the page showing the problem. Similarly,
     bugs reported by people while using daily builds (dogfood
     testing) and bugs found on top100 pages will consist of entire
     web pages.

     An entire web page is usually not very useful to anyone by
     itself. If a bug is found on a web page, then the web site will
     have to be turned into a reduced test to be useful for developers
     (engineer test development). This will then typically be used as
     the basis for a group of more complicated QA tests for regression
     testing (and maybe finding more bugs in that area).

     Example:
        http://www.cnn.com/ or any other web site.

   * Reduced test (attached to a bug). In order for an engineer to
     find the root cause of a bug, it is helpful if the original page
     which demonstrates the bug is simplified to the point where no
     unrelated material is left. This is known as minimizing or
     reducing a test case, and forms part of engineer test
     development. (A sketch of a reduced test appears after this
     list.)

     Reduced tests are typically extremely small (less than one
     kilobyte including any support files) and extremely simple. There
     are obviously exceptions to these rules, for example tests for
     bugs that only manifest themselves with one megabyte files will
     be big (although still simple) and bugs that are only visible
     with a convoluted set of conditions will be complicated (although
     still small).

     A good reduced test will also be self-explanatory, if that can be
     done without adding text which would be unrelated to the test.

     Examples:
        http://bugzilla.mozilla.org/attachment.cgi?id=25713&action=view
        http://bugzilla.mozilla.org/attachment.cgi?id=39662&action=view

   * Simple test. When the implementation of a feature is in its
     infancy, it is useful to create a few simple tests to check the
     basics. These tests are also known as isolation tests (since the
     features are typically tested in isolation) or ping tests (in
     computing terms, to ping something means to check that it is
     alive).

     A simple test is as minimal as a reduced test, but it is designed
     to be easy for QA to use, rather than for engineers, and may
     therefore have the appearance of a complicated test.

     Simple tests are often used as part of a complicated test. When
     used in this way they are known as a control test, by analogy
     with the concept of a control in experimental physics. If the
     control test fails, then the rest of the test is to be considered
     irrelevant. For example, if a complicated test uses colour
     matching to test various colour related properties, then a good
     control test would be one testing that the 'color' property is
     supported at all.

     Example:
        http://www.hixie.ch/tests/adhoc/css/cascade/style/001.xml

   * Complicated test. This is the most useful type of test for QA,
     and is the type of test most worth writing. One well written
     complex test can show half a dozen bugs, and is therefore worth
     many dozens of simple tests, since for a complicated feature to
     work, the simpler features it uses must all work too.

     Complicated tests should appear to be very simple, but their
     markup can be quite convoluted, since such a test is typically
     testing combinations of several things at once. The next chapter
     describes how to write these tests.

     Example:
        http://www.hixie.ch/tests/adhoc/css/selectors/not/006.xml

   * Use case demo page. Occasionally, pages will be written to
     demonstrate a particular feature. Pages like this are written for
     various reasons -- they are written by marketing teams to show
     users the new features of a product, they are written by
     technology evangelism teams to show web developers features that
     would make their site more interesting or to answer frequently
     asked questions about a particular feature, sometimes they are
     even written for fun! CSS, due to its very graphical nature, has
     many demo pages.

     Demo pages are really another kind of complicated test, except
     that because the target audience is not QA it may take longer to
     detect failures and reduce them to useful tests for engineers.

     Example:
       http://damowmow.com/mozilla/demos/layout/demo.html

   * Extremely complicated demo page. When an area has been
     highlighted as needing a lot of new complicated tests, it may be
     hard to decide where to begin working. To help decide, one can
     attempt to write an entire web site using the feature in question
     (plus any others required for the site). During this process, ANY
     bug discovered should be noted, and then used as the basis for
     complicated tests.

     This technique is surprisingly productive, and has the added
     advantage of discovering bugs that will be hit in real web sites,
     meaning that it also helps with prioritisation.

     At least two web sites exist purely to act as extremely
     complicated demo pages:
        http://www.libpr0n.com/
        http://www.mozillaquestquest.com/

   * Automated tests. These are used for the same purposes as
     complicated tests, except that they are then added to automated
     regression test suites rather than manual test suites.

     Typically the markup of automated tests is impenetrable to anyone
     who hasn't worked on them, due to the peculiarities of the test
     harness used for the test. This means that when a bug is found on
     an automated test, reducing it to a reduced test can take a long
     time, and sometimes it is easier to just use automated tests as a
     pointer for running related complicated tests. This is suboptimal
     however, and well designed automated tests have clear markings in
     the source explaining what is critical to reproducing the test
     without its harness.

     The worst fear of someone running automated tests is that a
     failure will be discovered that can only be reproduced with the
     harness, as reducing such a bug can take many hours due to the
     complexities of the test harnesses (for example the interactions
     with the automation server).

     Example:
        http://www.hixie.ch/tests/ngdriver/domcss/sc2p004.html

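As a sketch of the scale involved, a hypothetical bug in which borders
are painted at the wrong width might reduce to nothing more than this
(the markup here is illustrative, not taken from an actual bug
report):

       <style type="text/css">
        div { border: 10px solid green; }
       </style>
       <div>This box should have a thick green border.</div>

Anything else -- scripts, images, unrelated style rules -- would be
deleted before the test is attached to the bug.
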

FINDING BUGS

The following flowchart is a summary of this section.

                                     BETA FEEDBACK
                                           |
    EXTREMELY                             \|/
   COMPLICATED        USER FEEDBACK --> WEB SITE <-- DOGFOOD
    DEMO PAGE                              |
        |                                  |
        |                                  |
       \|/                                \|/
     LIST OF -----> COMPLICATED ------> REDUCED
      BUGS   <-----    TESTS             TEST
       /|\            |                    |
        |            \|/                  \|/
        AUTOMATED TESTS                BUG FILED



CSS: WHAT HAS BEEN DONE SO FAR
------------------------------

EXISTING COVERAGE (AS OF SEPTEMBER 2001)

CSS1 is thoroughly covered, and methodical testing at this stage would
not find enough new bugs per test written to be worth the time
investment. The only exception would be the list related properties.

CSS2 is less thoroughly covered. Positioning, tables, generated
content and the font matching algorithm have had little testing.

Selectors, the cascade, syntax, the block box model, the inline box
model, floats, colour and background related properties, the text
properties, and the font properties are all well covered.

Current tests are spread across many test suites, including:

   * http://www.hixie.ch/tests/adhoc/
     A large selection of complicated tests designed for ease of use
     by QA.

   * http://www.hixie.ch/tests/evil/
     Some very complicated tests and test generators. These tests are
     designed more with exploratory testing in mind -- in some cases,
     it is not even clear what the correct behaviour should be.

   * http://www.people.fas.harvard.edu/~dbaron/csstest/
     A set of complicated tests. Some of these tests require careful
     study and are not designed for use by QA.

   * http://www.bath.ac.uk/~py8ieh/internet/results.html
     An earlier set of complicated tests. Most of these tests are very
     descriptive, and are therefore quite useful when learning CSS.
     This test suite has some tests that examine some fundamental, if
     complicated, aspects of CSS, such as the inline box model and the
     'width' and 'height' properties.

   * http://www.w3.org/Style/CSS/Test/current/
     The official W3C CSS1 Test Suite.


NEW TESTS

The majority of new tests should be in the areas listed as lacking
tests in the previous section. These are the areas that have the least
support in Mozilla.

With the recent advent of LAS, the automation system for layout tests,
it would be a good idea to work on automating the many manual tests
already in existence. Having done this, linking the automation with
the tinderbox tests would give a good advance warning of regressions.


WRITING MANUAL QA TESTS FOR CSS
-------------------------------

HOW MANUAL QA TESTS ARE USED

Tests are viewed one after the other in quick succession, usually in
groups of several hundred to a thousand. As such, it is vital that:

   * the results be easy to determine,
   * the tests need no more than a few seconds to convey their results
     to the tester,
   * the tests not require an understanding of the specification to be
     used.

A badly written test can lead to the tester not noticing a regression,
as well as breaking the tester's concentration.


IDEAL TESTS

Well designed CSS tests typically fall into the following categories,
named after the features that the test will have when correctly
rendered by a user agent (UA).

Note: The terms "the test has passed" and "the test has failed" refer
to whether the user agent has passed or failed a particular test -- a
test can pass in one web browser and fail in another. In general, the
language "the test has passed" is used when it is clear from context
that a particular user agent is being tested, and the term
"this-or-that-user-agent has passed the test" is used when multiple
user agents are being compared.

   * The green paragraph. This is the simplest form of test, and is
     most often used when testing the parts of CSS that are
     independent of the rendering, like the cascade or selectors. Such
     tests consist of a single line of text describing the pass
     condition, which will be one of the following:

       This line should be green.
       This line should have a green border.
       This line should have a green background.

     (A markup sketch of a test of this type appears after this
     list.)

     Examples:
        http://www.hixie.ch/tests/adhoc/css/box/inline/002.html
        http://www.hixie.ch/tests/adhoc/css/background/20.xml

   * The green page. This is a variant on the green paragraph test.
     There are certain parts of CSS that will affect the entire page;
     when testing these, this category of test may be used. Care has
     to be taken when writing tests like this to ensure that the test
     will not result in a single green paragraph if it fails. This is
     usually done by forcing the short descriptive paragraph to have a
     neutral colour (e.g. white).

     Example:
        http://www.hixie.ch/tests/adhoc/css/background/18.xml
     (This example is poorly designed, because it does not look red
     when it has failed.)

   * The green block. This is the best type of test for cases where a
     particular rendering rule is being tested. The test usually
     consists of two boxes of some kind that are (through the use of
     positioning, negative margins, zero line height, or other
     mechanisms) carefully placed over each other. The bottom box is
     coloured red, and the top box is coloured green. Should the top
     box be misplaced by a faulty user agent, it will cause the red to
     be shown. (These tests sometimes come in pairs, one checking that
     the first box is no bigger than the second, and the other
     checking the reverse. A sketch of a green block test also appears
     after this list.)

     Examples:
        http://www.hixie.ch/tests/adhoc/css/box/absolute/001.xml
        http://www.hixie.ch/tests/adhoc/css/box/table/010.xml

   * The green paragraph and the blank page. These tests appear to be
     identical to the green paragraph tests mentioned above. In
     reality, however, they actually have more in common with the
     green block tests, but with the green block coloured white
     instead. This type of test is used when the displacement that
     could be expected in the case of failure is likely to be very
     small, and so any red must be made as obvious as possible.
     Because of this, the test would appear totally blank when it has
     passed. This is a problem, because a blank page is the symptom of
     a badly handled network error. For this reason, a single line
     of green text is added to the top of the test, reading something
     like:

       This line should be green and there should be no red on this
       page.

     Examples:
        http://www.hixie.ch/tests/adhoc/css/fonts/size/002.xml

   * The two identical renderings. It is often hard to make a test
     that is purely green when the test passes and visibly red when
     the test fails. For these cases, it may be easier to make a
     particular pattern using the feature that is being tested, and
     then have a reference rendering next to the test showing exactly
     what the test should look like.

     The reference rendering could be either an image, in the case
     where the rendering should be identical, to the pixel, on any
     machine, or the same pattern made using totally different parts
     of the CSS specification. (Doing the second has the advantage of
     making the test a test of both the feature under test and the
     features used to make the reference rendering.)

     Examples:
        http://www.hixie.ch/tests/adhoc/css/box/block/003.html
        http://www.hixie.ch/tests/adhoc/css/box/table/003.html
        http://www.hixie.ch/tests/adhoc/css/box/ib/002.xml

   * The positioned text. There are some cases where the easiest test
     to write is one where the four letters of the word 'PASS' are
     individually positioned on the page. This type of test is then
     said to have passed when all that can be seen is the word with
     all its letters aligned. Should the test fail, the letters are
     likely to go out of alignment, for instance:

       PA
         SS

     ...or:

       SSPA

     The problem with this test is that when there is a failure it is
     sometimes not immediately clear that the rendering is wrong.
     (e.g. the first example above could be thought to be
     intentional.)

     Example:
        http://www.hixie.ch/tests/adhoc/css/box/block/text-indent/001.html

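Here is a minimal sketch of a green paragraph test for class selectors
(the class name is illustrative). If the selector under test is not
supported, the first rule still applies and the line is red,
signalling failure:

       <style type="text/css">
        p { color: red; }
        p.test { color: green; }
       </style>
       <p class="test">This line should be green.</p>

And a minimal sketch of a green block test, here using a negative top
margin as the feature under test. If the margin is mishandled, the
green box fails to cover the red one and some red is revealed:

       <p>There should be a green square below, and no red.</p>
       <div style="background: red; width: 5em; height: 5em;"></div>
       <div style="background: green; width: 5em; height: 5em;
                   margin-top: -5em;"></div>
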
Ideal tests, as well as having well defined characteristics when they
pass, should have some clear signs when they fail. It can sometimes be
hard to make a test do something only when the test fails, because it
is very hard to predict how user agents will fail! Furthermore, in a
rather ironic twist, the best tests are those that catch the most
unpredictable failures!

Having said that, here are the best ways to indicate failures:

   * Red. This is probably the best way of highlighting bugs. Tests
     should be designed so that if the rendering is a few pixels off
     some red is uncovered.

     Examples:
        http://www.hixie.ch/tests/adhoc/css/box/block/first-line/001.html

   * Overlapped text. Tests of the 'line-height', 'font-size' and
     similar properties can sometimes be devised in such a way that a
     failure will result in the text overlapping.

   * The word "FAIL". Some properties lend themselves well to this
     kind of test, for example 'quotes' and 'content'. The idea is
     that if the word "FAIL" appears anywhere, something must have
     gone wrong. (A sketch of this technique appears after this
     list.)

     Examples:
        http://www.hixie.ch/tests/adhoc/css/box/table/004.html
        http://www.hixie.ch/tests/adhoc/css/box/absolute/002.xml

   * Scrambled text. This is similar to using the word "FAIL", except
     that instead of (or in addition to) having the word "FAIL" appear
     when an error is made, the rest of the text in the test is
     generated using the property being tested. That way, if anything
     goes wrong, it is immediately obvious.

     Examples:
        http://www.hixie.ch/tests/adhoc/css/quotes/001.xml

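As a sketch of the "FAIL" idea, applied here to attribute selectors
rather than the generated content properties (the class name is
illustrative): if the selector is not supported, the hidden word
becomes visible and the line reads "...passed, FAIL.":

       <style type="text/css">
        span[class="x"] { display: none; }
       </style>
       <p>This test has passed<span class="x">, FAIL</span>.</p>
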
These are in addition to the signs inherent to the various test types;
e.g., a difference between the two halves of a two identical
renderings test obviously also shows a bug.


TESTS TO AVOID

   * The long test. Any manual test that is so long that it needs to
     be scrolled to be completed is too long. The reason for this
     becomes obvious when you consider how manual tests will be run.
     Typically, the tester will be running a program (such as
     "Loaderman") which cycles through a list of several hundred
     tests. Whenever a failure is detected, the tester will do
     something (such as hit a key) that takes a note of the test case
     name. Each test will be on the screen for about two or three
     seconds. If the tester has to scroll the page, that means he has
     to stop the test to do so.

     Of course, there are exceptions -- the most obvious one being any
     tests that examine the scrolling mechanism! However, these tests
     are considered tests of user interaction and are not run with the
     majority of the tests.

     In general, any test that is so long that it needs scrolling can
     be split into several smaller tests, so in practice this isn't
     much of a problem.

     This is an example of a test that is too long:
         http://www.bath.ac.uk/~py8ieh/internet/eviltests/lineheight3.html

   * The counter intuitive "this should be red" test. As mentioned
     many times in this document, red indicates a bug, so nothing
     should ever be red in a test.

     There is one important exception to this rule... the test for the
     'red' value for the colour properties!

     The first subtest on this page shows this problem:
        http://www.people.fas.harvard.edu/~dbaron/css/test/childsel

   * Unobvious tests. A test that has half a sentence of normal text
     with the second half bold if the test has passed is not very
     obvious, even if the sentence in question explains what should
     happen.

     There are various ways to avoid this kind of test, but no general
     rule can be given since the affected tests are so varied.

     The last subtest on this page shows this problem:
        http://www.w3.org/Style/CSS/Test/current/sec525.htm

TECHNIQUES

In addition to the techniques mentioned in the previous sections,
there are some techniques that are important to consider or to
underscore.

   * Overlapping. This technique should not be cast aside as a
     curiosity -- it is in fact one of the most useful techniques for
     testing CSS, especially for areas like positioning and the table
     model.

     The basic idea is that a red box is first placed using one set of
     properties, e.g. the block box model's margin, height and width
     properties, and then a second box, green, is placed on top of the
     red one using a different set of properties, e.g. using absolute
     positioning.

     This idea can be extended to any kind of overlapping, for example
     overlapping two lines of identical text of different colours.

   * Special Fonts. Todd Fahrner has developed a font called Ahem,
     which consists of some very well defined glyphs of precise sizes
     and shapes. This font is especially useful for testing font and
     text properties. Without this font it would be very hard to use
     the overlapping technique with text. (A sketch using Ahem appears
     after this list.)

     Examples:
        http://www.hixie.ch/tests/adhoc/css/fonts/ahem/001.xml
        http://www.hixie.ch/tests/adhoc/css/fonts/ahem/002.xml

   * The self explanatory sentence followed by pages of identical
     text. For tests that must be long (e.g. scrolling tests), it is
     important to make it clear that the filler text is not relevant,
     otherwise the tester may think he is missing something and
     therefore waste time reading the filler text. Good text for use
     in these situations is, quite simply, "This is filler text. This
     is filler text. This is filler text.". If it looks boring, it's
     working!

   * Colour. In general, using colours in a consistent manner is
     recommended. Specifically, the following convention has been
     developed:

       Red: Any red indicates failure.

       Green: In the absence of any red, green indicates success.

       Blue: Tests that do not use red or green to indicate success or
         failure should use blue to indicate that the tester should
         read the text carefully to determine the pass conditions.

       Black: Descriptive text is usually black.

       Fuchsia, Yellow, Teal: These are useful colours when making
         complicated patterns for tests of the two identical
         renderings type.

       Gray: Descriptive lines, such as borders around nested boxes,
         are usually light gray. These lines come in useful when
         trying to reduce the test for engineers. Dark gray is
         sometimes used for filler text to indicate that it is
         irrelevant.

     Here is an example of blue being used:
        http://www.hixie.ch/tests/adhoc/css/fonts/size/004.xml

   * Methodical testing. There are particular parts of CSS that can be
     tested quite thoroughly with a very methodical approach. For
     example, testing that all the length units work for each property
     taking lengths is relatively easy, and can be done methodically
     simply by creating a test for each property/unit combination. (A
     sketch of this approach appears after this list.)

     In practice, the important thing to decide is when to be
     methodical and when to simply test, in an ad hoc fashion, a cross
     section of the possibilities.

     This example is a methodical test of the :not() pseudo-class with
     each attribute selector in turn, first for long values and then
     for short values:
        http://www.hixie.ch/tests/adhoc/css/selectors/not/010.xml
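
Here is a minimal sketch of the overlapping technique applied to text
using Ahem (assuming the font is installed; in Ahem the 'X' glyph is a
solid box filling the em square, so a row of Xs forms a solid bar).
The feature under test here is the 'line-height' property: if the line
box is not exactly 1em tall, the green bar fails to cover the red one
and some red shows:

       <style type="text/css">
        div { font: 20px/1 Ahem; }
        .a { color: red; }
        .b { color: green; margin-top: -20px; }
       </style>
       <div class="a">XXXXXXXXXX</div>
       <div class="b">XXXXXXXXXX</div>
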
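And a sketch of the methodical approach for one property (the values
are chosen so that, on a typical display where 1in is rendered as
96px, every line is indented by the same amount; 'em' and 'ex'
subtests would use values computed from the test's font):

       .px { margin-left: 48px;   }
       .pt { margin-left: 36pt;   }
       .pc { margin-left: 3pc;    }
       .in { margin-left: 0.5in;  }
       .cm { margin-left: 1.27cm; }
       .mm { margin-left: 12.7mm; }

Each class is then applied to its own paragraph reading "This line
should be indented exactly as far as the others.", and the whole set
is repeated for the next property that takes lengths.

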
GLOSSARY
--------

There are many terms which will be encountered when writing or using
tests for CSS. This list is by no means complete, but should give the
reader a head start.

   * Full support. The unachievable goal of perfection. A user agent
     which claims to have "full support" for a specification is
     claiming the impossible. In addition to the great difficulty of
     attaining "full support", there is the problem that the
     specification itself currently has some minor contradictions, and
     therefore *cannot* be fully implemented.

   * 100% Support. See full support. Note that Microsoft claim that
     Internet Explorer has "100% support for CSS1" while meaning that
     Internet Explorer passes the majority of the tests explicitly
     mentioned in the W3C CSS1 Test Suite that test the CSS1 core
     properties and that are not controversial.

   * Best support. At all times, one particular user agent will have
     the best implementation of CSS. There is a quite friendly and
     healthy rivalry between the competing implementors to beat the
     others in terms of CSS support, and this is probably the main
     reason for increased support in recent releases of the main
     browsers.

   * Complete support. See full support.

   * Compliant implementation. Claiming to be a compliant
     implementation is not as bold as claiming full support, but is
     just as unlikely to be true. The main difference between a full
     implementation and a compliant implementation is that the
     specification lists certain aspects as being optional, and
     therefore one can legitimately fail to implement those parts.

   * Comprehensive testing. A feature has been comprehensively tested
     if every possible combination has been tested. This is generally
     impossible unless the feature is very well defined. For example,
     testing all possible style sheets to ensure that they are all
     correctly parsed is impossible, because it would take longer to
     do that than the estimated lifetime of the universe. However, it
     is possible (although rather pointless) to perform that exercise
     for all one byte style sheets.

   * Exhaustive testing. See comprehensive testing.

   * Implementation. See user agent.

   * Methodical testing. This is the antithesis of ad hoc testing.
     Methodical testing is the act of taking a set of possible input
     values, and enumerating all permutations, creating a test for
     each. (Due to the mechanical nature of this process, it is common
     to create such tests using some sort of script.)

   * Thorough testing. A feature is said to have been thoroughly
     tested if it is believed that a reasonably large and well
     distributed cross section of possible combinations has been
     tested. This is no guarantee that no bugs are lurking in the
     untested cases, of course!

   * User agent. A web browser. Technically, a user agent can be more
     than just a web browser -- any application that processes CSS is
     a user agent of some kind. For example, a CSS validator would be
     classified as a user agent.


REFERENCES
----------

[1] Cascading Style Sheets:
    http://www.w3.org/TR/REC-CSS2/
[2] Dogfood in the Jargon File:
    http://www.tuxedo.org/~esr/jargon/html/entry/dogfood.html


CONTRIBUTORS
------------

Ian Hickson <ian@hixie.ch>
