An Updated Function for Newegg from The Feed Doctor

September 6, 2013

By Anthony Alford

One of the things I try to do is
look for patterns, especially in business rules. If lots of people are writing
the same rule over and over, that’s an opportunity for us to make our
customers’ lives easier by turning those rules into their own functions or
devising new ways to address the issue.

For example, I had a couple of different people ask me for
help with a rule for their Newegg feeds. Newegg lets you include
HTML in your product descriptions, but the HTML can only contain certain tags;
for example: “p,” “b,” and “img,” but not “blink.” As you know, there is a
STRIPHTML function that removes all HTML, but that’s not what we want here. One
of our support engineers came up with an outstanding regular expression to do
this:

REGEXREPLACE($itemauctiondescription,
"<(?!ol)(?!/ol)(?!ul)(?!/ul)(?!li)(?!/li)(?!br)(?!/br)(?!b)(?!/b)(?!p)(?!/p)(?!i)(?!/i)(?!u)(?!/u)(?!em)(?!/em)(?!strong)(?!/strong)(?!sub)(?!/sub)(?!sup)(?!/sup).*?>",
"")

This clever rule uses an advanced regular expression feature
called “lookaround,” which we won’t go into here. Suffice it to say this regex
scores 9.8 for style points. But good grief, we don’t want people to have to
type that every time they want to sell on Newegg! This is exactly why we create
new functions.
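
For the curious, here is a minimal sketch of the same lookahead idea in Python rather than in the rule language. It assumes Python's re module is a fair stand-in for the regex flavor the feed engine uses, which this post does not confirm. The names ALLOWED, PATTERN, and strip_disallowed are made up for illustration. One deliberate difference from the rule above: the sketch adds a \b word boundary after the tag name, because in Python, at least, a bare (?!b) also spares any tag that merely starts with "b", such as "blink".

```python
import re

# Tag names copied from the rule above.
ALLOWED = ["ol", "ul", "li", "br", "b", "p", "i", "u", "em", "strong", "sub", "sup"]

# A single negative lookahead covers every allowed name, with an optional
# "/" so closing tags are spared too. The \b is an addition that is not in
# the original rule: it requires the tag name to end right there, so "b"
# does not also protect "blink".
PATTERN = re.compile(r"<(?!/?(?:%s)\b)[^>]*>" % "|".join(ALLOWED), re.IGNORECASE)


def strip_disallowed(html):
    """Remove every tag whose name is not in ALLOWED; keep the text between tags."""
    return PATTERN.sub("", html)


sample = "<p>Fast <b>SSD</b> <blink>sale</blink> <font color='red'>today</font></p>"
print(strip_disallowed(sample))
# prints: <p>Fast <b>SSD</b> sale today</p>
```

The lookahead never consumes any characters. It only checks what follows the "<" and vetoes the match when an allowed tag name comes next, which is why allowed tags pass through untouched.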

Paint Stripper

As it turns out, I wound up not having
to create a new function, but rather updating good ol’ STRIPHTML to have an
additional optional input. The old STRIPHTML still works, so STRIPHTML($itemauctiondescription)
will still output the description text with all HTML removed. However, if you
want STRIPHTML to keep certain tags, you can now give it a
comma-separated list of those tags, like this:

STRIPHTML($itemauctiondescription, "ol,ul,li,br,b,p,i,u,em,strong,sub,sup")

This makes it much easier to manage
Newegg feeds. Also, it is flexible, since you can specify different lists of accepted
tags; that way, if another marketplace has a similar restriction, but with a
different list of tags, you just slightly tweak that list.
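
To make the optional second input concrete, here is a rough Python sketch of how a function with this shape could behave. It is not how STRIPHTML is actually implemented. The names strip_html, keep, and TAG are hypothetical, and the sketch only handles ordinary tags, not HTML comments or entities.

```python
import re

# Matches an opening or closing tag and captures its name.
# Assumption: tag names are plain letters and digits.
TAG = re.compile(r"</?\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def strip_html(text, keep=""):
    """Remove HTML tags from text, except those named in the
    comma-separated keep list. With no keep list, every tag is removed."""
    allowed = {name.strip().lower() for name in keep.split(",") if name.strip()}

    def replace(match):
        # Leave the tag alone if its name is on the list, otherwise drop it.
        return match.group(0) if match.group(1).lower() in allowed else ""

    return TAG.sub(replace, text)


description = "<p>Great <blink>deal</blink> on <b>RAM</b></p>"

# Old behavior: no second input, all tags removed.
print(strip_html(description))
# prints: Great deal on RAM

# Keep the tags from the Newegg example.
print(strip_html(description, "ol,ul,li,br,b,p,i,u,em,strong,sub,sup"))
# prints: <p>Great deal on <b>RAM</b></p>

# A different (made-up) marketplace with a shorter list.
print(strip_html(description, "p,br"))
# prints: <p>Great deal on RAM</p>
```

Making the list an optional argument with an empty default is what lets existing one-argument calls keep working unchanged.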

This modification to STRIPHTML is
now available as of our latest software update. Enjoy!

Blogpost by Anthony Alford, The Feed Doctor @thefeeddoctor


Want more Feed Doctor Tips? Check out these other posts:

And the Oscar goes to…The Feed Doctor for his work on REGEXGET

The Feed Doctor’s New Business Rule Functions

A Dose of Tips from The Feed Doctor

A Holiday Gift from The Feed Doctor