ACTFL 2011 Feedback

I’ve just come back from presenting on the status of Open CAP at ACTFL. Since active development has all but stopped due to funding cuts, the presentation focused primarily on the intersection of test development, curriculum, and teaching. Towards the end of the session, attendees were asked to share any issues, concerns, or feature requests they had for a tool like Open CAP. Here are some of the comments:

  • Quality — Several people mentioned the need to separate the wheat from the chaff in terms of item quality. The ultimate usefulness of the tool depends on the quality of the content in addition to the quantity.
  • Attribution — One attendee noted that many online collaborations leave little trace of who made the original contribution. This makes it hard for faculty to quantify the contributions they have made for things such as tenure applications. To the extent that the application can track individual contributions, this problem will be minimized.
  • Reputation — Several attendees expressed a desire for some kind of reputation system within the application, in order to make it easier to determine which items have a better pedigree.
  • Accessibility — Given the potential size of the item bank, the need for an extensive tagging system was voiced.
  • Comparability — One attendee mentioned that it would be useful to see what types of items and tests were being used in peer institutions. This would help reduce search overload initially, as folks could start with things that were being used at similar institutions and adapt them from there.
  • Security — The issue of item exposure was mentioned, as well as the potential for student cheating if the system is completely open to any user.

Overall, the session attendees were very supportive of the idea of an online collaborative tool such as Open CAP. In the run-up to ACTFL, I also became familiar with a similar tool developed for university placement, called SLUPE.

Posted in Uncategorized

The importance of feedback

In his book Managing the Design Factory, Donald G. Reinertsen talks about the importance of generating high-quality information early and often in any process. Although he is writing about product development, these ideas apply to teaching and learning as well. Consider the following quote:

The key implication for management is that we must be concerned with both the efficiency with which we generate information and the timing of that information…The pressure to control testing costs often encourages us to use a few big “killer” tests, but such tests will delay the arrival of information. (Reinertsen, 2007, p. 73)

If you wait until the end of the term or the chapter to determine whether or not the students understand the material, there is little chance that those results will be used to actively improve the situation. It is better to gather information throughout the term. This is often called formative assessment, since feedback from the assessment can directly influence what happens next in the teaching/learning process.

Information collection can also be made more efficient by considering carefully what type of information to target. Rather than just giving “a reading test” or “a grammar quiz” based loosely on the general theme of the class, try to relate assessments to the particular goals and objectives of that lesson. For long-term objectives such as improving speaking proficiency, break things down into smaller chunks.


Posted in Uncategorized

LTRC 2011 Handout

OpenCAP was presented as a work-in-progress at LTRC 2011 in Ann Arbor, Michigan. You can view the LTRC_2011_Handout. Discussions during the session revolved primarily around the issues of quality and usability: how can item quality be ensured while keeping the system usable and non-threatening for users with limited or no experience in test development?

Posted in development

The technology of OpenCAP

OpenCAP is currently being developed using a number of open source tools. We are using the Python-based framework Django for application development. Django was chosen for its relative simplicity and the ability to get up and running quickly. The backend database is PostgreSQL, which CASLS has used successfully in the past with other projects. OpenCAP will be running on GNU/Linux servers (Red Hat). We use Jenkins for continuous integration builds, and developers use Eclipse as the primary IDE.
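To give a concrete (and purely illustrative) sense of how Django and PostgreSQL fit together, the pairing comes down to a short database block in the project’s settings.py. The database name, user, and password below are placeholders rather than the actual Open CAP configuration, and the engine string is the psycopg2-backed one used by Django releases of this era.

```python
# settings.py (sketch) -- placeholder credentials, not the real Open CAP settings
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',  # PostgreSQL via psycopg2
        'NAME': 'opencap',         # placeholder database name
        'USER': 'opencap_user',    # placeholder role
        'PASSWORD': 'change-me',   # placeholder password
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
```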

Posted in development

Mocking Open CAP

Mockup in Balsamiq

A picture is worth a thousand words. Or, in the case of software development, a mockup is worth a thousand pages of specifications. We have been using a tool called Balsamiq (http://balsamiq.com/) to create quick mockups of what some of the Open CAP screens might look like. The goal is not to create an aesthetically pleasing layout (for now), but rather to get a sense of how the user will interact with the application and what kind of input fields will be required. The mockup above shows how the idea of tag clouds and examples might be built into the application.

HTML mockup of item entry screen

Here is another mockup. This picture is not a “real” screenshot of Open CAP. It is just a simple HTML mockup cobbled together from an existing screen in our test engine and an example of tabbed content from the jQuery JavaScript library (http://jquery.com/). If you were to click on the tabs for “Levels” or “Functions” on the right side of the screen, you would see only “Lorem ipsum” text. Even so, this simple mockup is enough to show how multiple “job aids” might be presented to a user.

Posted in development

Data modeling

When building a house, an architect takes the homeowner’s description of his or her dream home and translates it into detailed plans for a structurally sound dwelling. Every aspect of the home, such as the size of each room, where the bathrooms will be, and the like, needs to be determined. If the architect forgets to add the second bedroom or fails to account for the plumbing, there could be problems. Some things, such as the color of the walls, are fairly easy to change even after the house is built. Other things, such as converting a walk-in closet to a second bathroom, may take a lot more effort.

Creating a data model for a web application like Open CAP is somewhat like drawing up blueprints for a house: it forms the underlying plan of how concepts in the system relate to each other. The data model identifies the “things” in the program. For Open CAP, those “things” might be items, tests, login information, and the like. A database will be used to store information about those things, so the types of things and their relationships to each other must be defined. The computer doesn’t know anything about the properties of test plans, item specifications, or content reviews. The computer doesn’t know that a test is composed of items or that a class is composed of students unless we make those relationships explicit.
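To make that explicitness concrete, here is a minimal Django sketch of how a “test is composed of items” relationship might be declared. The model and field names are hypothetical, chosen for illustration rather than taken from the actual Open CAP data model.

```python
# models.py (sketch) -- hypothetical names, not the actual Open CAP schema
from django.contrib.auth.models import User
from django.db import models

class Item(models.Model):
    """A single assessment item contributed by a user."""
    prompt = models.TextField()
    created_by = models.ForeignKey(User, on_delete=models.CASCADE)  # supports attribution
    created_at = models.DateTimeField(auto_now_add=True)

class Test(models.Model):
    """A test is composed of items; the relationship is stated explicitly."""
    title = models.CharField(max_length=200)
    items = models.ManyToManyField(Item, related_name='tests')
```

Declaring the link as many-to-many leaves open the possibility that the same item could appear on more than one test, which is exactly the kind of built-in flexibility described below.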

We’ve been spending time recently trying to capture the various pieces of information that will make up our data model for Open CAP. The goal is to try to make the data structures flexible enough to cope with a variety of situations, even if we choose not to fully implement features that take advantage of that flexibility right away. To return to our house analogy, an architect can design a house such that the hot water pipes run through the wall in the walk-in closet to give the homeowners the option of turning that closet into a bathroom at a later date.  We are trying to do the same thing for Open CAP.

Posted in development

A comment about comments

We have been getting a lot of “bot”-generated comments recently, so please be patient if you submitted a comment and did not see it posted right away.

Posted in Uncategorized

Core features

When designing a new product or educational tool, it is enjoyable to think about all of the possibilities, no matter how fanciful (“Gee, wouldn’t it be cool if we could automatically generate a daily report for each study abroad student using automatic speech recognition and GPS technology to track their proficiency gains during their trek across China?”).  At some point, however, those great plans need to be tempered by the realities of available resources. It would certainly be possible to try to develop a product by working down the list of great ideas one at a time until the money runs out, but that is probably not very efficient and there is no guarantee that an amalgamation of cool features will lead to a coherent product.

One tool to help prioritize work when resources are limited is the idea of core functionality. These are the absolutely essential criteria without which you don’t really have a product. For example, there are three things that a plane absolutely must be able to do:

  1. Take off
  2. Fly
  3. Land

No matter what else you do, if your development timeline doesn’t include the tasks necessary to make those three things a reality, you will not have a successful product at the end.

For Open CAP, the core functionality might be described as:

  1. make tests
  2. give tests
  3. see test results

Without the ability to perform those tasks, Open CAP has no chance of being useful for assessment. Of course, the mere presence of this core functionality doesn’t mean that Open CAP will be a great tool, but at least it will be complete at a core level.

Posted in Uncategorized

When differences aren’t

I came across an interesting blog post (http://blog.gdinwiddie.com/2010/04/19/the-importance-of-precise-estimates/) about time estimation on software projects (not language related, but interesting in its own right for those involved in project management). One of the comments on the post included this gem: “Measure with a micrometer, mark with a crayon, cut with a chainsaw.” In other words, things can get lost in translation.

I think this sometimes goes the other way in education, especially when people suddenly find themselves looking at numerical scores. Numbers take on a life of their own, and it seems to be human nature to treat differences in numbers, especially when reported to several decimal places, as important and in need of explanation (when, in fact, the real explanation is often that the original measure was not very precise and the differences are random error). Imagine a test given “for fun” in Class A and Class B. When Excel shows that the average of Class A is 27.34 and that of Class B is 29.16, many people (especially administrators?) can’t help but wonder why Class B did “better” than Class A.
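A quick simulation makes the point; the score scale, class size, and spread below are invented solely for illustration.

```python
# Invented numbers: both "classes" are sampled from the same distribution,
# so any gap between their averages is random error, not a real difference.
import random

random.seed(2011)

def class_average(n_students=25, true_mean=28.0, sd=5.0):
    """Average score for one class drawn from the SAME ability distribution."""
    scores = [random.gauss(true_mean, sd) for _ in range(n_students)]
    return sum(scores) / len(scores)

class_a = class_average()
class_b = class_average()
print("Class A: %.2f   Class B: %.2f   difference: %.2f"
      % (class_a, class_b, class_b - class_a))
```

Run it a few times without the seed and the “better” class changes from run to run.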

Perhaps we need to be wary of “Measure with an eyeball, report with a number, analyze with a microscope”.

Posted in Opinion

“What test do you use?”

Anyone involved in a language program who has ever talked with counterparts from a different program has probably heard (or even asked) this question at some point. With limited resources, it does not make sense to reinvent the wheel if you don’t have to. Finding out what has worked in other programs makes perfect sense.

However, a certain amount of caution is warranted. Even though we might find it expedient to treat tests the way we treat duct tape (good for anything, including repairs to space vehicles – http://en.wikipedia.org/wiki/Duct_tape), that approach may not be appropriate when it comes to educational assessment. The validity of a test is not an inherent property of the test itself that follows the Xeroxed copies from program to program. Rather, test validity refers to the meaning we attach to scores from the test and the extent to which we can justify the interpretations we want to make. For example, although the road test to get a driver’s license might involve hand/eye coordination, we certainly wouldn’t want to use that test every time we needed to measure someone’s dexterity. Nor would we want to rely on a hand/eye coordination test as evidence that someone is capable of safely operating a motor vehicle on the highway.

This creates a potential problem for a community-based system like Open CAP, in which the goal is to leverage assessment content created by a variety of users. Will Open CAP encourage rampant test misuse, as users virtually swap tests the way teachers swap ideas for class activities? What is the best way to ensure that Open CAP users create educationally sound assessments even as they recombine parts from different sources?

Posted in Uncategorized